In this paper, we study the problem of pointwise estimation of a multivariate density. We provide 
a data-driven selection rule from the family of kernel estimators and derive for it a pointwise 
oracle inequality. Using the latter bound, we show that the proposed estimator is minimax and 
minimax adaptive over the scale of anisotropic Nikolskii classes. It is important to emphasize 
that our estimation method adjusts automatically to a possible independence structure of the 
underlying density. This, in turn, makes it possible to reduce significantly the influence of the 
dimension on the accuracy of estimation (curse of dimensionality). The main technical tools used in our 
considerations are pointwise uniform bounds of empirical processes developed recently in Lepski 
[Math. Methods Statist. 22 (2013) 83-99]. 
1. Introduction 

Let $X_i = (X_{i,1},\dots,X_{i,d})$, $i \in \mathbb N^*$, be a sequence of $\mathbb R^d$-valued i.i.d. random vectors defined on a complete probability space $(\Omega,\mathfrak A,\mathrm P)$ and having the density $f$ with respect to the Lebesgue measure. Furthermore, $\mathbb P_f^{(n)}$ denotes the probability law of $X^{(n)} = (X_1,\dots,X_n)$, $n \in \mathbb N^*$, and $\mathbb E_f^{(n)}$ is the mathematical expectation with respect to $\mathbb P_f^{(n)}$.

Our goal is to estimate the density $f$ at a given point $x_0 \in \mathbb R^d$ using the observation $X^{(n)} = (X_1,\dots,X_n)$, $n \in \mathbb N^*$. By an estimator, we mean any $X^{(n)}$-measurable mapping $\tilde f: \mathbb R^n \to \mathbb R$, and the accuracy of an estimator is measured by the pointwise risk:

\[ \mathcal R_n^{(q)}[\tilde f, f] := \big(\mathbb E_f^{(n)}|\tilde f(x_0) - f(x_0)|^q\big)^{1/q}, \qquad q \ge 1. \]

The discussion of traditional methods and a part of the vast literature on the theory 
and application of density estimation is given by Devroye and Györfi [7], Silverman 
[40] and Scott [39]. We do not attempt here to provide a detailed overview and 
mention only the results which are relevant to the problems considered. The minimax and 
adaptive minimax multivariate density estimation with Lp-loss on particular functional 
classes was studied in Bretagnolle and Huber [2], Ibragimov and Khasminskii [21, 22], 
Devroye and Lugosi [8-10], Efroimovich [13, 14], Hasminskii and Ibragimov [20], Golubev 
[19], Donoho et al. [11], Kerkyacharian, Picard and Tribouley [26], Giné and Guillou [15], 
Juditsky and Lambert-Lacroix [23], Rigollet [36], Massart [33] (Chapter 7), Samarov and 
Tsybakov [38], Birgé [1], Mason [32], Giné and Nickl [16], Chacón and Duong [5] and 
Goldenshluger and Lepski [18]. In Comte and Lacour [6], the pointwise setting was first 
considered in the context of the multidimensional deconvolution model. More recently, in 
Goldenshluger and Lepski [17], adaptive minimax upper bounds were proved for multivariate 
density estimation with Lp-risks on anisotropic Nikolskii classes using a local 
(pointwise) procedure. The use of Nikolskii classes makes it possible to consider the estimation of 
anisotropic and inhomogeneous densities; see Ibragimov and Khasminskii [22], Goldenshluger 
and Lepski [18] and Lepski [29]. 

In this paper, we focus on the problem of the minimax and adaptive minimax pointwise 
multivariate density estimation over the scale of anisotropic Nikolskii classes. 

Minimax estimation. In the framework of minimax estimation, it is assumed that $f$ belongs to a certain set of functions $\Sigma$, and then the accuracy of an estimator $\tilde f$ is measured by its maximal risk over $\Sigma$:

\[ \mathcal R_n^{(q)}[\tilde f,\Sigma] := \sup_{f\in\Sigma}\big(\mathbb E_f^{(n)}|\tilde f(x_0)-f(x_0)|^q\big)^{1/q}, \qquad q \ge 1. \tag{1} \]

The objective here is to construct an estimator $\tilde f_*$ which achieves the asymptotics of the minimax risk (minimax rate of convergence):

\[ \mathcal R_n^{(q)}[\tilde f_*,\Sigma] \asymp \inf_{\tilde f}\mathcal R_n^{(q)}[\tilde f,\Sigma] =: \varphi_n(\Sigma). \]

Here, the infimum is taken over all possible estimators. 

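Smoothness assumption. Let $\Sigma$ be either a Hölder class $\mathbb H(\beta,L)$ or an $L_p$-Sobolev class $\mathbb W(\beta,p,L)$ of univariate functions. Here, $\beta$ represents the smoothness of the underlying density and $p$ is the index of the norm in which the smoothness is measured. Then

\[ \varphi_n(\mathbb H(\beta,L)) = n^{-\beta/(2\beta+1)}, \qquad \varphi_n(\mathbb W(\beta,p,L)) = n^{-(\beta-1/p)/(2(\beta-1/p)+1)}, \qquad \beta > 0,\ 1 \le p < \infty. \tag{2} \]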

These minimax rates can be obtained from the results developed by Donoho and Low 
[12]; see also Ibragimov and Khasminskii [21, 22], and Hasminskii and Ibragimov [20]. 

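Let now $\Sigma = \mathbb H_d(\beta,L)$, where $\mathbb H_d(\beta,L)$ is an anisotropic Hölder class determined by the smoothness parameter $\beta = (\beta_1,\dots,\beta_d)$. In this case,

\[ \varphi_n(\mathbb H_d(\beta,L)) = n^{-\bar\beta/(2\bar\beta+1)}, \qquad \frac{1}{\bar\beta} := \sum_{i=1}^d \frac{1}{\beta_i}, \qquad \beta_i > 0,\ i = 1,\dots,d. \tag{3} \]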
The latter result can be obtained from Kerkyacharian, Lepski and Picard [24], Proposition 1, 
in the framework of the Gaussian white noise model. Similar minimax 
results will be established for pointwise multivariate density estimation in Section 3.2; 
see Theorems 2 and 3. 

It is important to emphasize that minimax rates depend heavily on the dimension d. 
Let us briefly discuss how to reduce the influence of the dimension on the accuracy of 
estimation (curse of dimensionality). The approach which has been recently proposed in 
Lepski [29] is to take into account a possible independence structure of the underlying 
density. 

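Structural assumption. Denote by $\mathcal I_d$ the set of all subsets of $\{1,\dots,d\}$ and by $\mathfrak P$ the set of all partitions of $\{1,\dots,d\}$ completed by the empty set $\emptyset$. For all $I \in \mathcal I_d$ and $x \in \mathbb R^d$ denote also $x_I = (x_i)_{i\in I}$, $\bar I = \{1,\dots,d\}\setminus I$, $|I| = \operatorname{card}(I)$ and put

\[ f_I(x_I) := \int_{\mathbb R^{|\bar I|}} f(x)\,dx_{\bar I}. \]

Obviously, $f_I$ is the marginal density of $X_{1,I}$ and, to take into account the independence structure of the density $f$, we consider the following set:

\[ \mathfrak P(f) := \Big\{\mathcal P \in \mathfrak P:\ f(x) = \prod_{I\in\mathcal P} f_I(x_I),\ \forall x \in \mathbb R^d\Big\}. \]
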

In this paper, we focus on the problem of pointwise multivariate density estimation on anisotropic Nikolskii classes. In particular, we will prove that the minimax rate on the class $N^*_{p,d}(\beta,L,\mathcal P)$ (introduced in Lepski [29], see the definition in Section 3.1) for fixed $\beta \in (0,+\infty)^d$, $p \in [1,+\infty]^d$, $L \in (0,+\infty)^d$, $\mathcal P \in \mathfrak P(f)$, is given by

\[ \varphi_n(N^*_{p,d}(\beta,L,\mathcal P)) = n^{-r/(2r+1)}, \qquad r := \inf_{I\in\mathcal P}\frac{1-\sum_{i\in I}1/(\beta_i p_i)}{\sum_{i\in I}1/\beta_i}. \]

If $d = 1$, then the structural assumption does not exist, which means formally $\mathcal P = \emptyset$, and we come to the rates given in (2). Note that $N^*_{\infty,1}(\beta,L,\emptyset)$ coincides with the set of densities belonging to $\mathbb H(\beta,L)$ and that $N^*_{p,1}(\beta,L,\emptyset)$ contains the set of densities belonging to $\mathbb W(\beta,p,L)$.

If $d \ge 2$, $p_i = \infty$, $i = 1,\dots,d$, and $\mathcal P = \emptyset$ we find again the rates given in (3), and $N^*_{\infty,d}(\beta,L,\emptyset)$ coincides with a set of densities belonging to $\mathbb H_d(\beta,L)$. Note however that if $\mathcal P \ne \emptyset$ the latter rates can be essentially improved. Indeed, if, for instance, $\beta = (\beta,\dots,\beta)$ and $\mathcal P^* = \{\{1\},\dots,\{d\}\}$, then $r = \beta$ and

\[ n^{-\beta/(2\beta+d)} = \varphi_n(\mathbb H_d(\beta,L)) \gg \varphi_n(N^*_{\infty,d}(\beta,L,\mathcal P^*)) = n^{-\beta/(2\beta+1)}. \tag{4} \]

Moreover, $\varphi_n(N^*_{\infty,d}(\beta,L,\mathcal P^*))$ does not depend on the dimension $d$. 
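
To get a feeling for the size of the gap between the two exponents in (4), here is a purely illustrative numerical reading. The values $\beta = 2$, $d = 4$ and $n = 10^4$ are chosen here for convenience and are not taken from the paper; constants are ignored.

```latex
\[
  n^{-\beta/(2\beta+d)} = (10^{4})^{-2/8} = 10^{-1} = 0.1,
  \qquad
  n^{-\beta/(2\beta+1)} = (10^{4})^{-2/5} = 10^{-1.6} \approx 0.025 .
\]
```

Under these assumed values, full independence of the coordinates improves the order of the pointwise accuracy by roughly a factor of four.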

We remark that minimax rates (accuracy of estimation) depend heavily on the parameters 
$\beta$, $p$ and $\mathcal P$. Their knowledge can rarely be assumed in practice. This 
makes it necessary to find an estimator whose construction is parameter-free. 
Adaptive minimax estimation. In the framework of adaptive minimax estimation the underlying density $f$ is supposed to belong to a given scale of functional classes $\{\Sigma_\alpha, \alpha \in A\}$. For instance, $\Sigma_\alpha = \mathbb H(\beta,L)$, $\alpha = (\beta,L)$, or $\Sigma_\alpha = \mathbb W(\beta,p,L)$, $\alpha = (\beta,p,L)$.

The first question arising in the framework of the adaptive approach is the following: does there exist an estimator $\tilde f_*$ such that

\[ \limsup_{n\to+\infty}\big\{\varphi_n^{-1}(\alpha)\,\mathcal R_n^{(q)}[\tilde f_*,\Sigma_\alpha]\big\} < +\infty \qquad \forall\alpha \in A, \tag{5} \]

where $\varphi_n(\alpha)$ is the minimax rate of convergence over $\Sigma_\alpha$. 

As was shown in Lepski [31] for the Gaussian white noise model, the answer to this question is negative if $\Sigma_\alpha = \mathbb H(\beta,L)$, $\alpha = (\beta,L)$. Brown and Low [3] extended this result to pointwise density estimation. Further, Butucea [4] extended the results of Brown and Low [3] to the scale of $L_p$-Sobolev classes $\mathbb W(\beta,p,L)$. In Section 3.3.2, we will prove that the answer is also negative for multivariate density estimation at a given point over the scale of anisotropic Nikolskii classes $N^*_{p,d}(\beta,L,\mathcal P)$. 

Thus, for problems in which (5) does not hold we need first to find a family of normalizations $\Psi = \{\Psi_n(\Sigma_\alpha), \alpha \in A\}$ and an estimator $\tilde f_\Psi$ such that

\[ \limsup_{n\to+\infty}\big\{\Psi_n^{-1}(\alpha)\,\mathcal R_n^{(q)}[\tilde f_\Psi,\Sigma_\alpha]\big\} < +\infty \qquad \forall\alpha \in A. \tag{6} \]

Any family of normalizations satisfying (6) is called admissible and the estimator $\tilde f_\Psi$ is called $\Psi$-adaptive. Next, we have to provide a criterion of optimality allowing us to select "the best" admissible family of normalizations, usually called the adaptive rate of convergence. The first criterion was proposed in Lepski [31] and it was improved later in Tsybakov [41] and in Klutchnikoff [27]. 

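In particular, in Lepski [31] and in Butucea [4], it was shown that the adaptive rate of convergence for the considered problem is

\[ \Psi_n(\mathbb H(\beta,L)) = \begin{cases} \big(\ln(n)/n\big)^{\beta/(2\beta+1)}, & \beta \in (0,\beta_{\max}),\\ \big(1/n\big)^{\beta/(2\beta+1)}, & \beta = \beta_{\max}, \end{cases} \]

\[ \Psi_n(\mathbb W(\beta,p,L)) = \begin{cases} \big(\ln(n)/n\big)^{(\beta-1/p)/(2(\beta-1/p)+1)}, & \beta \in (0,\beta_{\max}),\\ \big(1/n\big)^{(\beta-1/p)/(2(\beta-1/p)+1)}, & \beta = \beta_{\max}, \end{cases} \]

with respect to the criterion in Lepski [31] and Tsybakov [41], respectively. Here, $\beta_{\max}$ is an arbitrary positive number. 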

Later Klutchnikoff [27] studied the pointwise adaptive minimax estimation over 
anisotropic Hölder classes, in the Gaussian white noise model. The consideration of 
anisotropic functional classes required the development of a new criterion of optimality. Following 
this criterion, Klutchnikoff [27] proved that the adaptive rate of convergence is 



Recently, Comte and Lacour [6] found a similar form of admissible sequence for pointwise 
adaptive minimax estimation in the deconvolution model. 

In Section 3.3, we provide a minimax adaptive estimator for pointwise multivariate 
density estimation over the scale of anisotropic Nikolskii classes. We will take into account 
not only the approximation properties of the underlying density but a possible 
independence structure as well. To analyze the accuracy of the proposed estimator, we 
establish a so-called pointwise oracle inequality, proved in Section 5.3. We will also show 
that the adaptive rate of convergence is given by 



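\[ \psi_n(\beta,p,\mathcal P) := \begin{cases} \big(\ln(n)/n\big)^{r/(2r+1)}, & 0 < r < r_{\max},\\ \big(1/n\big)^{r_{\max}/(2r_{\max}+1)}, & r = r_{\max}. \end{cases} \]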



To assert the optimality of this family of normalizations, we generalize the criterion 
proposed in Klutchnikoff [27]; see Section 3.3.2. 

Organization of the paper. In Section 2, we provide a measurable data-driven selection 
rule based on bandwidth selection of kernel estimators and we derive an oracle-type 
inequality for the selected estimator at a given point. In Section 3, we treat the complete 
problem of minimax and adaptive minimax pointwise multivariate density estimation on 
a scale of anisotropic Nikolskii classes taking into account the independence structure of 
the underlying density. In Section 4, we briefly compare our local method with the global 
one developed in Lepski [29]. Proofs of all main results are given in Section 5. Proofs of 
technical lemmas are postponed to the Appendix. 

2. Selection rule and pointwise oracle-type inequality 

2.1. Kernel estimators related to independence structure 

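Let $K: \mathbb R \to \mathbb R$ be a fixed symmetric kernel satisfying $\int K = 1$, $\operatorname{supp}(K) \subseteq [-1/2,1/2]$,

\[ \|K\|_\infty < \infty, \qquad \exists L_K > 0:\quad |K(x)-K(y)| \le L_K|x-y| \quad \forall x,y \in \mathbb R. \tag{7} \]

For all $I \in \mathcal I_d$, $h \in (0,1]^d$ and $x \in \mathbb R^d$ put also

\[ K^{(I)}(x_I) := \prod_{i\in I}K(x_i), \qquad V_{h_I} := \prod_{i\in I}h_i, \qquad K_{h_I}(x_I) := V_{h_I}^{-1}\prod_{i\in I}K(x_i/h_i); \]

\[ \hat f_{h_I}(x_{0,I}) := n^{-1}\sum_{i=1}^n K_{h_I}(X_{i,I} - x_{0,I}). \]

Then introduce the family of estimators

\[ \mathfrak F[\mathfrak P] := \Big\{\hat f_{(h,\mathcal P)}(x_0) = \prod_{I\in\mathcal P}\hat f_{h_I}(x_{0,I}),\ (h,\mathcal P) \in (0,1]^d\times\mathfrak P\Big\}. \]

Note first that $\hat f_{(h,\emptyset)}(x_0) = \hat f_h(x_0)$ is the Parzen-Rosenblatt estimator (see, e.g., Rosenblatt [37], Parzen [35]) with kernel $K^{(\bar\emptyset)}$, $\bar\emptyset = \{1,\dots,d\}$, and multibandwidth $h$.

Next, the introduction of the estimator $\hat f_{(h,\mathcal P)}(x_0)$ is based on the following simple observation. If there exists $\mathcal P \in \mathfrak P(f)$, the idea is to estimate separately each marginal density corresponding to $I \in \mathcal P$. Since the estimated density possesses the product structure, we seek its estimator in the same form.

Below we propose a data-driven selection from the family $\mathfrak F[\mathfrak P]$.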
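
The product-form estimator described above can be sketched in a few lines. This is a minimal illustration only, under assumptions that are not in the paper: the triangular kernel is just one convenient symmetric Lipschitz kernel supported on [-1/2, 1/2], coordinates are indexed from 0, and the function names `marginal_estimate` and `product_estimate` are hypothetical.

```python
import numpy as np

def triangular_kernel(t):
    """Symmetric Lipschitz kernel on [-1/2, 1/2] integrating to 1 (an assumed choice)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= 0.5, 4.0 * (0.5 - np.abs(t)), 0.0)

def marginal_estimate(X, x0, h, I, kernel=triangular_kernel):
    """Kernel estimate of the marginal density f_I at x0_I with bandwidths h_I."""
    I = list(I)
    u = (X[:, I] - x0[I]) / h[I]            # rescaled differences, shape (n, |I|)
    weights = np.prod(kernel(u), axis=1)    # product kernel over the coordinates in I
    return weights.mean() / np.prod(h[I])   # normalise by V_{h_I} = prod of h_i

def product_estimate(X, x0, h, partition, kernel=triangular_kernel):
    """Estimator indexed by (h, partition): product of the marginal estimates."""
    return float(np.prod([marginal_estimate(X, x0, h, I, kernel) for I in partition]))

# Two observations in dimension 2; the second one is far from x0.
X = np.array([[0.0, 0.0], [1.0, 1.0]])
x0 = np.array([0.0, 0.0])
h = np.array([0.5, 0.25])
full = product_estimate(X, x0, h, [[0, 1]])        # no independence structure
split = product_estimate(X, x0, h, [[0], [1]])     # independent coordinates
```

With the trivial partition the function reduces to the usual multivariate kernel estimator, while finer partitions multiply lower-dimensional estimates, which is where the gain in the stochastic error comes from.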


2.2. Auxiliary estimators and extra parameters 


To define our selection rule, we need to introduce some notation and quantities. 
Auxiliary estimators. For $I \in \mathcal I_d$ and $h \in (0,1]^d$ put 


Ghiixoj) := 1 V 






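Introduce for $I \in \mathcal I_d$ and $h,\eta \in (0,1]^d$ the auxiliary estimators

\[ \hat f_{h_I,\eta_I}(x_{0,I}) := n^{-1}\sum_{i=1}^n K_{h_I\vee\eta_I}(X_{i,I} - x_{0,I}), \qquad h_I\vee\eta_I := (h_i\vee\eta_i)_{i\in I}. \]

Note that the idea to use such auxiliary estimators, defined with the multibandwidth $h\vee\eta$, appeared for the first time in Kerkyacharian, Lepski and Picard [24], in the framework of the Gaussian white noise model.

We endow the set $\mathfrak P$ with the operation "$\circ$" introduced in Lepski [29]: for any $\mathcal P,\mathcal P' \in \mathfrak P$

\[ \mathcal P\circ\mathcal P' := \{I\cap I' \ne \emptyset,\ I \in \mathcal P,\ I' \in \mathcal P'\} \in \mathfrak P. \]

Then we define for $h,\eta \in (0,1]^d$ and $\mathcal P,\mathcal P' \in \mathfrak P$

\[ \hat f_{(h,\mathcal P),(\eta,\mathcal P')}(x_0) := \prod_{I\in\mathcal P\circ\mathcal P'}\hat f_{h_I,\eta_I}(x_{0,I}). \tag{8} \]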
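
The operation "o" on partitions keeps all non-empty intersections of a block of the first partition with a block of the second one. A minimal sketch, with partitions represented as collections of Python sets (a representation assumed here for illustration; the name `compose` is hypothetical):

```python
def compose(P, Q):
    """Return P o Q: every non-empty intersection of a block of P with a block of Q."""
    blocks = {frozenset(I) & frozenset(J) for I in P for J in Q}
    return {b for b in blocks if b}   # drop the empty intersection

P = [{1, 2}, {3, 4}]
Q = [{1}, {2, 3}, {4}]
PQ = compose(P, Q)           # the common refinement of P and Q
trivial = [{1, 2, 3, 4}]
same = compose(trivial, Q)   # composing with the trivial partition changes nothing
```

The result is again a partition, namely the coarsest one that refines both arguments, so composing a partition with itself returns it unchanged.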
Set of parameters. Our selection rule consists in choosing an estimator $\hat f_{(h,\mathcal P)}(x_0)$ when the parameter $(h,\mathcal P)$ belongs at most to the set $\mathfrak H[\mathfrak P]$ defined as follows.

Let $\mathfrak z > 0$, $\tau(s) \in (0,1]$, $s = 1,\dots,d$, be fixed numbers and let $\mathfrak h^{(I)} \in (0,1]^{|I|}$, $I \in \mathcal I_d$, be fixed multibandwidths. All these parameters will be chosen in accordance with our procedure. 

Set also A := sup/gj^jl V A|^|^^[K, 3 ]} and a := {2Ai/r+^}“^, where constants 

Ai^^[K,3],s € N*,g> 1, are given in Section 5.1. The explicit expressions of Ai'^^[K, 3 ] 
are too cumbersome and it is not convenient for us to present them right now. 

For all I G Id and all integer m > 0 introduce 




PPi ■■= {hi G (0, < Vn, < vP_,V^ip}n n 

^ ^ iGl 

SjP2--={hiG{0,lp-. 

/M„(/) \ /M„(/) \ 

U U U "’ST 




m—1 


m—1 


where vP := 2 , M„(/) is the largest integer satisfying [V^(n A t4iax] > 

and Mn{I) < log 2 (n), and Vmax is defined below. 

Define finally 


:= {{h,V) G (0,1]'^ x^: hi G e V}. 

Extra parameters. Let $\overline{\mathfrak H}$ and $\overline{\mathfrak P}$ be arbitrary subsets of $(0,1]^d$ and $\mathfrak P$, respectively. The selection rule (9)-(10) below runs over $\overline{\mathfrak H}[\overline{\mathfrak P}] := (\overline{\mathfrak H}\times\overline{\mathfrak P})\cap\mathfrak H[\mathfrak P]$ and the reasons for introducing these extra parameters are discussed in Remark 1. In particular, for measurability reasons, we will always suppose that $\overline{\mathfrak H}$ is either a compact or a finite subset of $(0,1]^d$. 

Set A„(xo) ■.= Z\df\lGn{xo)p~^, where 


Gn{xo):= sup_ sup_sup [2Gh,'Ir„{xo,l)]- 

(h,V)&S)p(3] {ri,V')&S)p(3] I&VoV 


Put also Kiax := suppg^inf/gTi^ lA (j) and, for {h,V) G (0,1]'^ x *p, 

Pm ax 


5{h,V)-.= svip sup 

v /n/^€'Pop^ 




^/n/' 


inf/g-p lA (J) 


Define finally, for (/i,P) € (0,1]'^ x *p, 
hl{h,v){xo) := 


[G„(xo)]2{lVln5(/i,P)} 


nV{h,V) 


V{h,V) := inf P?.,. 
2.3. Selection rule 

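For $(h,\mathcal P) \in (0,1]^d\times\mathfrak P$ introduce

\[ \hat\Delta_{(h,\mathcal P)}(x_0) := \sup_{(\eta,\mathcal P')\in\overline{\mathfrak H}[\overline{\mathfrak P}]}\Big[\big|\hat f_{(h,\mathcal P),(\eta,\mathcal P')}(x_0) - \hat f_{(\eta,\mathcal P')}(x_0)\big| - \Lambda_n(x_0)\big\{\hat{\mathcal U}_{(\eta,\mathcal P')}(x_0) + \hat{\mathcal U}_{(h,\mathcal P)}(x_0)\big\}\Big]_+ . \tag{9} \]

Define finally $(\hat h,\hat{\mathcal P})$ satisfying

\[ \hat\Delta_{(\hat h,\hat{\mathcal P})}(x_0) + 2\Lambda_n(x_0)\hat{\mathcal U}_{(\hat h,\hat{\mathcal P})}(x_0) = \inf_{(h,\mathcal P)\in\overline{\mathfrak H}[\overline{\mathfrak P}]}\big[\hat\Delta_{(h,\mathcal P)}(x_0) + 2\Lambda_n(x_0)\hat{\mathcal U}_{(h,\mathcal P)}(x_0)\big]. \tag{10} \]

The selected estimator is $\hat f_n(x_0) := \hat f_{(\hat h,\hat{\mathcal P})}(x_0)$.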

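Similarly to Section 2.1 in Lepski [29] it is easy to show that $(\hat h,\hat{\mathcal P})$ is $X^{(n)}$-measurable and that $(\hat h,\hat{\mathcal P}) \in \overline{\mathfrak H}[\overline{\mathfrak P}]$. It follows that $\hat f_n(x_0)$ is also an $X^{(n)}$-measurable random variable. 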

Remark 1. The necessity to introduce the extra parameters $\overline{\mathfrak H}$ and $\overline{\mathfrak P}$ is dictated by several reasons. The first one is computational, namely the computation of $\hat\Delta_{(h,\mathcal P)}(x_0)$ and $(\hat h,\hat{\mathcal P})$. However, the computational aspects of the choice of $\overline{\mathfrak P}$ and $\overline{\mathfrak H}$ are quite different. Typically, $\overline{\mathfrak H}$ can be chosen as an appropriate grid in $(0,1]^d$, for instance, a dyadic one, which is sufficient for proving adaptive properties of the proposed estimator. The choice of $\overline{\mathfrak P}$ is much more delicate. The reason for considering $\overline{\mathfrak P}$ instead of $\mathfrak P$ is that the cardinality of $\mathfrak P$ grows exponentially with the dimension $d$. Therefore, if $\overline{\mathfrak P} = \mathfrak P$, for large values of $d$ our procedure is not practically feasible in view of the huge number of comparisons to be done. In the latter case, the interest of our result is theoretical. Note also that the best attainable trade-off between approximation and stochastic errors depends heavily on both the number of observations and the effective dimension $d(f) = \inf_{\mathcal P\in\mathfrak P(f)}\sup_{I\in\mathcal P}|I|$. Thus, if $d(f)$ is big the corresponding independence structure does not bring a real improvement of the estimation accuracy. So, in practice, $\overline{\mathfrak P}$ is chosen to satisfy $\sup_{I\in\mathcal P}|I| \le d_0$, $\forall\mathcal P \in \overline{\mathfrak P}\setminus\{\emptyset\}$. The choice of the parameter $d_0$ (made by a statistician) is based on a compromise between the sample size $n$, the desired quality of estimation and the number of computations. For instance, one can consider $d_0 = 1$, which means that $\overline{\mathfrak P}$ contains two elements, $\{\{1,\dots,d\}\}$ and $\{\{1\},\dots,\{d\}\}$. The latter case corresponds to observations having independent components and it is illustrated in Example 1 below. On the other hand, in the case of low dimension $d$, one can always take $\overline{\mathfrak P} = \mathfrak P$, since if $d = 2$, $|\mathfrak P| = 2$, $d = 3$, $|\mathfrak P| = 5$, $d = 4$, $|\mathfrak P| = 15$, etc. 
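
The growth of the number of partitions, and the effect of capping the block size by a parameter d0, can be checked by direct enumeration. The sketch below is an illustration only; the helper names `set_partitions` and `restricted_partitions` are hypothetical and the trivial partition is not treated specially.

```python
def set_partitions(elements):
    """Yield every partition of the list `elements` as a list of blocks."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # place `first` into each existing block in turn ...
        for k in range(len(partition)):
            yield partition[:k] + [[first] + partition[k]] + partition[k + 1:]
        # ... or give it a block of its own
        yield [[first]] + partition

def restricted_partitions(d, d0):
    """Partitions of {1, ..., d} whose blocks all contain at most d0 elements."""
    return [P for P in set_partitions(list(range(1, d + 1)))
            if max(len(block) for block in P) <= d0]

counts = [len(list(set_partitions(list(range(1, d + 1))))) for d in (2, 3, 4, 5)]
capped = len(restricted_partitions(4, 2))
```

The counts are the Bell numbers, which increase very quickly with d, whereas restricting the block size keeps the family over which the selection rule runs much smaller.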

Other reasons are related to the possibility to consider various problems arising in the framework of minimax and minimax adaptive estimation and they will be discussed in detail in Sections 3.2 and 3.3.2. Here, we only mention that the choice $\overline{\mathfrak P} = \{\emptyset\}$ allows one to study the adaptive estimation of a multivariate density on $\mathbb R^d$ without taking into account a possible independence structure. We would like to emphasize that the latter problem was not studied in the literature.

Finally, the introduction of $\overline{\mathfrak P}$ makes it possible to minimize the assumptions imposed on the density to be estimated. In particular, the oracle inequality corresponding to $\overline{\mathfrak P} = \{\emptyset\}$ is proved over the set of bounded densities; see Corollary 1. 


In spite of the fact that the construction of the proposed procedure does not require 
any condition on the density /, the following assumption will be used for computing its 
risk: 

sup sup ||//||oo < f, £‘P(/) n‘pj, 0 < f <+oo. (II) 
^ ^'P'I^VoV' ^ 


Note that the considered class of densities is determined by *P and in particular, if 0 G *p, 
F,[f,*p] = |/: sup |l/7||oo<f}cF4f,^], 

/Gid 

F4f,{0}] = {/: ||/|U<f}, F,[f,{iP}] = {/: sup ||/z||oo < f,P G ^(/)}. 


2.4. Oracle-type inequality 

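For $I \in \mathcal I_d$ and $(h,\eta) \in (0,1]^d\times[0,1]^d$ introduce

\[ B_{h_I,\eta_I}(x_{0,I}) := \int_{\mathbb R^{|I|}}K^{(I)}(u)\big[f_I(x_{0,I} + (h_I\vee\eta_I)u) - f_I(x_{0,I} + \eta_I u)\big]\,du, \]

where here and later $y_I x_I$ denotes the coordinate-wise product of $y_I, x_I \in \mathbb R^{|I|}$.

For $(h,\mathcal P) \in (0,1]^d\times\mathfrak P$ define $\mathcal B_{(h,\mathcal P)}(x_0) := \sup_{\mathcal P'\in\overline{\mathfrak P}}\sup_{I\in\mathcal P\circ\mathcal P'}\sup_{\eta\in[0,1]^d}|B_{h_I,\eta_I}(x_{0,I})|$. Introduce finally, if there exists $\mathcal P \in \mathfrak P(f)\cap\overline{\mathfrak P}$,

\[ \mathfrak R_n(f) := \inf_{(h,\mathcal P)\in\overline{\mathfrak H}[\overline{\mathfrak P}]:\ \mathcal P\in\mathfrak P(f)}\Bigg\{\mathcal B_{(h,\mathcal P)}(x_0) + \sqrt{\frac{1\vee\ln\delta(h,\mathcal P)}{nV(h,\mathcal P)}}\Bigg\}. \]

The quantity $\mathfrak R_n(f)$ can be viewed as the optimal trade-off between approximation and stochastic errors provided by the estimators involved in the selection rule. 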


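Theorem 1. Let $\overline{\mathfrak H} \subseteq (0,1]^d$ and $\overline{\mathfrak P} \subseteq \mathfrak P$ be arbitrary subsets such that $\overline{\mathfrak H}[\overline{\mathfrak P}]$ is non-empty. 
Then for any $0 < \mathbf f < +\infty$, any $q \ge 1$ and any integer $n \ge 3$: 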

+ V/GFd[f,^], (12) 

where $\alpha_1 := \alpha_1(q,d,K,\mathbf f)$ and $\alpha_2 := \alpha_2(q,d,K,\mathbf f)$ are given in the proof of the theorem.

Considering the case $\overline{\mathfrak P} = \{\emptyset\}$, we come to the following consequence of Theorem 1. 
Corollary 1. Let the assumptions of Theorem 1 be fulfilled. Then, for all densities $f$ such 
that $\|f\|_\infty \le \mathbf f$, 


[/«:/]<«! inf 


sup \Bh,r,ixo)\ 


'ivin(n/y,,) 


nVh 


■a2[nVif\ 


- 1/2 


(13) 


Looking at the assertion of Theorem 1 and its Corollary 1, it is not clear what can 
be gained by taking into account a possible independence structure. This issue will be 
scrutinized in Section 3, but some conclusions can be deduced directly from the latter 
results. Consider the following example. 


Example 1. For any t G R, put 

f{t) = + (| “ 2t)l(i/8_i/4](t) + 11(1/4,3/4] (f) + (1 - 01(3/4,1] (f)}, 

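and define $f_d(x) = \prod_{i=1}^d f(x_i)$, $x \in \mathbb R^d$. It is easily seen that $f_d$ is a probability density and the goal is to estimate $f_d(x_0)$, $x_0 \in (3/8,5/8)^d$.

Choose $\eta = (1,\dots,1)$, $h = (1/4,\dots,1/4)$ and let $\overline{\mathfrak H} = \{\eta,h\}$. Put $\mathcal P_1 = \{\{1,\dots,d\}\}$, $\mathcal P_2 = \{\{1\},\dots,\{d\}\}$ and let $\overline{\mathfrak P} = \{\mathcal P_1,\mathcal P_2\}$. Since, in this case, $\overline{\mathfrak H}\times\overline{\mathfrak P}$ contains 4 elements, our estimator can be computed in a reasonable time. 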

Moreover, in accordance with the oracle-type inequality proved in Theorem 1, the 
accuracy provided by the selected estimator is proportional to $\sqrt{[4\ln(4)]/n}$. On the 
other hand, the pointwise risk of the kernel estimator with optimally chosen bandwidth 
and kernel is proportional to ln(4)]/n if the independence structure is not taken 

into account. As we see, the adaptation to a possible independence structure can lead to a 
significant improvement of the constant. This shows that the proposed methodology is of 
interest beyond the derivation of minimax rates, which is the subject of the next section. 


3. Minimax and adaptive minimax pointwise 
estimation 

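In this section, we study minimax and adaptive minimax estimation over a scale 
of anisotropic Nikolskii classes. 

3.1. Anisotropic Nikolskii classes of densities with independence structure 

Let $\{e_1,\dots,e_s\}$ denote the canonical basis in $\mathbb R^s$, $s \in \mathbb N^*$.

Definition 1. Let $p = (p_1,\dots,p_s)$, $p_i \in [1,\infty]$, $\beta = (\beta_1,\dots,\beta_s)$, $\beta_i > 0$, and $L = (L_1,\dots,L_s)$, $L_i > 0$. A function $f: \mathbb R^s \to \mathbb R$ belongs to the anisotropic Nikolskii class $N_{p,s}(\beta,L)$ if

(i) $\|D_i^k f\|_{p_i} \le L_i$, $\forall k = 0,\dots,\lfloor\beta_i\rfloor$, $\forall i = 1,\dots,s$;

(ii) $\|D_i^{\lfloor\beta_i\rfloor}f(\cdot + te_i) - D_i^{\lfloor\beta_i\rfloor}f(\cdot)\|_{p_i} \le L_i|t|^{\beta_i-\lfloor\beta_i\rfloor}$, $\forall t \in \mathbb R$, $\forall i = 1,\dots,s$.

Here, $D_i^k f$ denotes the $k$th order partial derivative of $f$ with respect to the variable $t_i$, and $\lfloor\beta_i\rfloor$ is the largest integer strictly less than $\beta_i$. 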

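The following collection $\{N^*_{p,d}(\beta,L,\mathcal P)\}_{\mathcal P}$ was introduced in Lepski [29] in order to take into account the smoothness of the underlying density and its possible independence structure simultaneously:

\[ N^*_{p,d}(\beta,L,\mathcal P) := \Big\{f \in N^*_{p,d}(\beta,L):\ f \ge 0,\ \int f = 1,\ f(x) = \prod_{I\in\mathcal P}f_I(x_I),\ \forall x \in \mathbb R^d\Big\}, \]

where $f \in N^*_{p,d}(\beta,L)$ means that

\[ f_I \in N_{p_I,|I|}(\beta_I,L_I) \qquad \forall I \in \mathcal I_d. \tag{14} \]

We remark that this collection of functional classes was used in the case of adaptive estimation, that is, when the partition $\mathcal P \in \mathfrak P$ is unknown. However, when minimax estimation is considered ($\mathcal P$ is fixed), we do not need condition (14) to hold for any $I \in \mathcal I_d$. It suffices to consider only $I$ belonging to $\mathcal P$, and we come to the following definition.

Definition 2 (Minimax estimation). Let $p = (p_1,\dots,p_d)$, $p_i \in [1,\infty]$, $\beta = (\beta_1,\dots,\beta_d)$, $\beta_i > 0$, $L = (L_1,\dots,L_d)$, $L_i > 0$ and $\mathcal P \in \mathfrak P$. A probability density $f: \mathbb R^d \to \mathbb R_+$ belongs to the class $N_{p,d}(\beta,L,\mathcal P)$ if

\[ f(x) = \prod_{I\in\mathcal P}f_I(x_I) \quad \forall x \in \mathbb R^d, \qquad f_I \in N_{p_I,|I|}(\beta_I,L_I) \quad \forall I \in \mathcal P. \tag{15} \]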

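Let us now come back to adaptive estimation. As was discussed in Remark 1, the adaptation is not necessarily considered with respect to $\mathfrak P$. If $\overline{\mathfrak P} \subseteq \mathfrak P$ is used instead of $\mathfrak P$, the assumption (14) is too restrictive and can be weakened in the following way. Denote $\overline{\mathfrak P}^* := \{\mathcal P\circ\mathcal P':\ \mathcal P,\mathcal P' \in \overline{\mathfrak P}\}$ and $\mathcal I_d^* := \{I \in \mathcal I_d:\ \exists\mathcal P \in \overline{\mathfrak P}^*,\ I \in \mathcal P\}$.

Definition 3 (Adaptive estimation). Let $\overline{\mathfrak P} \subseteq \mathfrak P$ and $(\beta,p,\mathcal P) \in (0,+\infty)^d\times[1,\infty]^d\times\overline{\mathfrak P}$ be fixed. A probability density $f: \mathbb R^d \to \mathbb R_+$ belongs to the class $\overline N_{p,d}(\beta,L,\mathcal P)$ if

\[ f(x) = \prod_{I\in\mathcal P}f_I(x_I) \quad \forall x \in \mathbb R^d; \qquad f_I \in N_{p_I,|I|}(\beta_I,L_I) \quad \forall I \in \mathcal I_d^*. \tag{16} \]

Some remarks are in order.

(1) We note that if $\overline{\mathfrak P} = \mathfrak P$ then $\overline N_{p,d}(\beta,L,\mathcal P) = N^*_{p,d}(\beta,L,\mathcal P)$, but for some $\overline{\mathfrak P} \subset \mathfrak P$, one has $N^*_{p,d}(\beta,L,\mathcal P) \subset \overline N_{p,d}(\beta,L,\mathcal P)$. The latter inclusion shows that the condition (16) is weaker than $f \in N^*_{p,d}(\beta,L)$. In particular, if $\overline{\mathfrak P} = \{\emptyset\}$, then $\overline N_{p,d}(\beta,L,\emptyset) = \{f \in N_{p,d}(\beta,L):\ f \ge 0,\ \int f = 1\} \supset N^*_{p,d}(\beta,L,\emptyset)$.

(2) Note that if $\overline{\mathfrak P} = \{\mathcal P\}$, then $\overline N_{p,d}(\beta,L,\mathcal P)$ coincides with the class $N_{p,d}(\beta,L,\mathcal P)$ used for minimax estimation. But $\overline N_{p,d}(\beta,L,\mathcal P) \subset N_{p,d}(\beta,L,\mathcal P)$ for all $\mathcal P \in \overline{\mathfrak P}$ for any other choice of $\overline{\mathfrak P}$. 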

3.2. Minimax results 

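For $(\beta,p,\mathcal P) \in (0,+\infty)^d\times[1,\infty]^d\times\mathfrak P$ define

\[ r := r(\beta,p,\mathcal P) = \inf_{I\in\mathcal P}\gamma_I(\beta,p), \qquad \gamma_I(\beta,p) := \frac{1-\sum_{i\in I}1/(\beta_i p_i)}{\sum_{i\in I}1/\beta_i}, \quad I \in \mathcal P; \]

\[ \varphi_n(\beta,p,\mathcal P) := (1/n)^{r/(2r+1)}, \qquad \rho_n(\beta,p,\mathcal P) := 1_{\{r\le 0\}} + \varphi_n(\beta,p,\mathcal P)1_{\{r>0\}}. \tag{17} \]

As will follow from Theorems 2 and 3 below, $\rho_n(\beta,p,\mathcal P)$ is the minimax rate of convergence on $N_{p,d}(\beta,L,\mathcal P)$. Hence, similarly to the standard representation of minimax rates, the parameter $r$ can be interpreted as a smoothness index corresponding to the independence structure. 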
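
The smoothness index can be evaluated numerically for given parameters. The sketch below assumes the reading r = inf over blocks I of (1 - sum_{i in I} 1/(beta_i p_i)) / (sum_{i in I} 1/beta_i), which matches the special cases quoted in the Introduction; coordinates are indexed from 0 and the function names are hypothetical.

```python
import math

def gamma_I(beta, p, I):
    """Block index: (1 - sum 1/(beta_i p_i)) / (sum 1/beta_i) over i in I."""
    num = 1.0 - sum(1.0 / (beta[i] * p[i]) for i in I)
    den = sum(1.0 / beta[i] for i in I)
    return num / den

def smoothness_index(beta, p, partition):
    """r(beta, p, partition): the smallest block index."""
    return min(gamma_I(beta, p, I) for I in partition)

def rate(n, beta, p, partition):
    """Order of the rate: n^{-r/(2r+1)} if r > 0, and 1 (no consistency) otherwise."""
    r = smoothness_index(beta, p, partition)
    return 1.0 if r <= 0 else n ** (-r / (2.0 * r + 1.0))

beta = [2.0, 2.0, 2.0]
p = [math.inf] * 3                                        # Hölder-type case
r_full = smoothness_index(beta, p, [[0, 1, 2]])           # no independence structure
r_indep = smoothness_index(beta, p, [[0], [1], [2]])      # independent coordinates
r_bad = smoothness_index([0.5], [1.0], [[0]])             # beta = 1/2, p = 1, d = 1
```

In the Hölder-type case the index of the trivial partition is the harmonic-mean quantity from (3), while full independence gives back the univariate smoothness, so the rate no longer depends on the dimension.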

Theorem 2. $\forall(\beta,p,\mathcal P) \in (0,+\infty)^d\times[1,\infty]^d\times\mathfrak P$, $\forall L \in (0,\infty)^d$, $\exists c > 0$:

\[ \liminf_{n\to+\infty}\Big\{\rho_n^{-1}(\beta,p,\mathcal P)\inf_{\tilde f_n}\mathcal R_n^{(q)}[\tilde f_n, N_{p,d}(\beta,L,\mathcal P)]\Big\} \ge c, \]

where the infimum is taken over all possible estimators.

Note that the assertion of Theorem 2 will be deduced from a more general result established in Proposition 1 below. It is also important to emphasize that if $r \le 0$ there is no uniformly consistent estimator for the considered problem and, to the best of our knowledge, this fact was not known before. Let us provide an example with a density for which $r < 0$. 

Example 2. Suppose that $d = 1$ and, therefore, $\mathcal P = \emptyset$ (no independence structure). For any $x \in \mathbb R$, put

\[ g(x) = 1_{\{0\}}(x) + \frac{1}{2\sqrt x}1_{(0,1]}(x). \]

Some straightforward computations allow us to assert that $g \notin N_{p,1}(\beta,L,\emptyset)$, $\forall L > 0$, if $p\beta > 1$ (i.e., $r > 0$), and that $g \in N_{1,1}(1/2,L,\emptyset)$ for some $L > 0$ ($p = 1$, $\beta = 1/2$). Thus, in this case, one has $r < 0$. 
Our goal now is to show that $\varphi_n(\beta,p,\mathcal P)$ is the minimax rate of convergence on $N_{p,d}(\beta,L,\mathcal P)$ and that a minimax estimator belongs to the collection $\mathfrak F[\mathfrak P]$. In fact, we prove that the minimax estimator is $\hat f_{(h,\mathcal P)}$ with properly chosen kernel $K$ and bandwidth $h$.

For a given integer $\ell \ge 2$ and a given symmetric Lipschitz function $u: \mathbb R \to \mathbb R$ satisfying $\operatorname{supp}(u) \subseteq [-1/(2\ell),1/(2\ell)]$ and $\int_{\mathbb R}u(y)\,dy = 1$ set

\[ u_\ell(y) := \sum_{i=1}^{\ell}\binom{\ell}{i}(-1)^{i+1}\frac{1}{i}\,u\Big(\frac{y}{i}\Big), \qquad y \in \mathbb R. \tag{18} \]

Furthermore, we use $K = u_\ell$ in the definition of the collection of estimators $\mathfrak F[\mathfrak P]$.

The relation of the kernel $u_\ell$ to anisotropic Nikolskii classes is discussed in Kerkyacharian, Lepski and Picard [26]. In particular, it was shown that

\[ \int K(z)\,dz = 1, \qquad \int z^kK(z)\,dz = 0 \quad \forall k = 1,\dots,\ell-1. \tag{19} \]
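
The moment conditions (19) can be checked numerically. The sketch below assumes the standard construction u_l(y) = sum_{i=1}^{l} C(l,i) (-1)^{i+1} (1/i) u(y/i) for the kernel in (18), and uses a triangular function as one admissible choice of u; both the base function and the helper names are assumptions made for illustration.

```python
import numpy as np
from math import comb

def u(y, ell):
    """Triangular base function: symmetric, Lipschitz, support [-1/(2 ell), 1/(2 ell)], integral 1."""
    a = 1.0 / (2 * ell)
    y = np.asarray(y, dtype=float)
    return np.where(np.abs(y) <= a, (a - np.abs(y)) / a**2, 0.0)

def u_ell(y, ell):
    """Assumed form of the kernel: alternating binomial combination of rescaled copies of u."""
    y = np.asarray(y, dtype=float)
    return sum(comb(ell, i) * (-1) ** (i + 1) * u(y / i, ell) / i
               for i in range(1, ell + 1))

def moment(k, ell, num=200001):
    """Trapezoidal approximation of the k-th moment of u_ell over [-1/2, 1/2]."""
    z = np.linspace(-0.5, 0.5, num)
    vals = z**k * u_ell(z, ell)
    dz = z[1] - z[0]
    return float(np.sum((vals[:-1] + vals[1:]) * dz / 2.0))

ell = 3
moments = [moment(k, ell) for k in range(0, 5)]   # k = 0, 1, 2, 3, 4
```

For this choice the kernel integrates to one and its moments of orders 1 to l - 1 vanish, while the moment of order 4 does not when l = 3, in line with the range of k stated in (19).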

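Choose finally $h = (h_1,\dots,h_d)$, where

\[ h_i = n^{-\frac{\gamma_I(\beta,p)}{2\gamma_I(\beta,p)+1}\cdot\frac{1}{\beta_i(I)}}, \qquad i \in I,\ I \in \mathcal P. \tag{20} \]

Here,

\[ \beta_i(I) := \kappa(I)\beta_i\kappa_i^{-1}(I), \qquad \kappa(I) := 1-\sum_{k\in I}(\beta_k p_k)^{-1}, \qquad \kappa_i(I) := 1-\sum_{k\in I}(p_k^{-1}-p_i^{-1})\beta_k^{-1}. \]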

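Theorem 3. For all $(\beta,p,\mathcal P) \in (0,\ell]^d\times[1,\infty]^d\times\mathfrak P$ such that $r(\beta,p,\mathcal P) > 0$ and all $L \in (0,\infty)^d$

\[ \limsup_{n\to+\infty}\big\{\varphi_n^{-1}(\beta,p,\mathcal P)\,\mathcal R_n^{(q)}[\hat f_{(h,\mathcal P)}, N_{p,d}(\beta,L,\mathcal P)]\big\} < \infty. \]

To get the statement of this theorem, we apply Theorem 1 with $\overline{\mathfrak P} = \{\mathcal P\}$ and $\overline{\mathfrak H} = \{h\}$. In view of the embedding theorem for anisotropic Nikolskii classes (formulated in the proof of Lemma 3 and available when $r(\beta,p,\mathcal P) > 0$), there exists a number $\mathbf f := \mathbf f(\beta,p) > 0$ such that $N_{p,d}(\beta,L,\mathcal P) \subseteq \mathbb F_d[\mathbf f,\{\mathcal P\}]$. This makes the application of Theorem 1 possible. 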

Let us briefly discuss several consequences of Theorems 2 and 3. First, if $\mathcal P = \emptyset$, we obtain the minimax rate on the anisotropic Nikolskii class $N_{p,d}(\beta,L)$. In particular, if $p_i = +\infty$, $i = 1,\dots,d$, we find the minimax rate on the anisotropic Hölder class $\mathbb H_d(\beta,L)$ given in (3). If $d = 1$, then our results coincide with those presented in (2).

Next, in view of Theorem 2 there is no consistent estimator for $f(x_0)$ on $N_{p,d}(\beta,L)$ if $r(\beta,p,\emptyset) \le 0$. On the other hand, if $f \in N_{p,d}(\beta,L,\mathcal P)$ and $r(\beta,p,\mathcal P) > 0$, then such an estimator for $f(x_0)$ does exist in view of Theorem 3 even if $r(\beta,p,\emptyset) \le 0$.

Note also that the condition $r(\beta,p,\emptyset) > 0$ is sufficient to find a consistent estimator on each functional class $N_{p,d}(\beta,L,\mathcal P)$, $\mathcal P \in \mathfrak P$, and that the same condition is necessary for the estimation over $N_{p,d}(\beta,L,\emptyset)$. It allows us to compare the influence of the independence structure on the accuracy of estimation. For example, we see that

\[ \varphi_n(\beta,p,\emptyset) \gg \varphi_n(\beta,p,\mathcal P), \qquad p_i = \infty,\ i = 1,\dots,d. \]

We conclude that the existence of an independence structure improves significantly the accuracy of estimation. 

We finish this section with the result being a refinement of Theorem 2. 

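Proposition 1. $\forall(\beta,p,\mathcal P) \in (0,+\infty)^d\times[1,\infty]^d\times\mathfrak P$, $\forall L \in (0,\infty)^d$, $\exists c > 0$:

\[ \liminf_{n\to+\infty}\Big\{\rho_n^{-1}(\beta,p,\mathcal P)\inf_{\tilde f_n}\mathcal R_n^{(q)}[\tilde f_n, N^*_{p,d}(\beta,L,\mathcal P)]\Big\} \ge c, \]

where the infimum is taken over all possible estimators.

Remark 2. Recall (see Section 3.1) that $N^*_{p,d}(\beta,L,\mathcal P) \subseteq \overline N_{p,d}(\beta,L,\mathcal P) \subseteq N_{p,d}(\beta,L,\mathcal P)$. Hence, the statement of Theorem 3 remains true if one replaces $N_{p,d}(\beta,L,\mathcal P)$ by $\overline N_{p,d}(\beta,L,\mathcal P)$, $\overline{\mathfrak P} \subseteq \mathfrak P$. Thus, Proposition 1 together with Theorem 3 allows us to assert that $\rho_n(\beta,p,\mathcal P)$ is the minimax rate of convergence on $\overline N_{p,d}(\beta,L,\mathcal P)$. 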


3.3. Adaptive estimation 

3.3.1. Adaptive estimation. Upper bound 

Let tp C tp, such that 0 £ *p, be fixed. Denote d{V) := sup/g-p |/|, V € *P, and d := 
infpg^dCP). 

Set > (d — d)/2, = -|-oo,i = l,d, and suppose additionally that 

l>2y /3max- Choose K = ui, 3 := — and r(s), s = 1,..., d, satisfying 

^Pmax ' ' 

t{s):=2/3 

max /(2/? max H“ ■ 

Let be the dyadic grid in (0,1]^ and let I £ Id, be the projection on the dyadic 
grid in (0, l]l^l of the multibandwith hj^^ given by 

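Consider the estimator $\hat f_n(x_0)$ defined by the selection rule (9)-(10), in Section 2.3. For $(\beta,p,\mathcal P) \in (0,\beta_{\max}]^d\times[1,\infty]^d\times\overline{\mathfrak P}$ introduce

\[ \psi_n(\beta,p,\mathcal P) := \begin{cases} \big(\ln(n)/n\big)^{r/(2r+1)}, & r := r(\beta,p,\mathcal P) < r_{\max},\\ \big(1/n\big)^{r_{\max}/(2r_{\max}+1)}, & r := r(\beta,p,\mathcal P) = r_{\max}. \end{cases} \tag{21} \]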
Theorem 4. For any $(\beta,p) \in (0,\beta_{\max}]^d\times[1,\infty]^d$ such that $r(\beta,p,\emptyset) > 0$, any $\mathcal P \in \overline{\mathfrak P}$ and any $L \in (0,\infty)^d$

\[ \limsup_{n\to+\infty}\big\{\psi_n^{-1}(\beta,p,\mathcal P)\,\mathcal R_n^{(q)}[\hat f_n,\overline N_{p,d}(\beta,L,\mathcal P)]\big\} < \infty. \]

Similarly to Theorem 3, the proof of Theorem 4 is mostly based on the result of Theorem 1. The application of Theorem 1 is possible because $\overline N_{p,d}(\beta,L,\mathcal P) \subseteq \mathbb F_d[\mathbf f,\overline{\mathfrak P}]$ for some $\mathbf f := \mathbf f(\beta,p) > 0$, which is guaranteed by the condition $r(\beta,p,\emptyset) > 0$.

We would like to emphasize that the construction of $\hat f_n(x_0)$ does not involve the knowledge of the parameters $(\beta,L,p,\mathcal P)$. Using the modern statistical language, one can say that $\hat f_n(x_0)$ is fully adaptive.

Note, however, that the precision $\psi_n(\beta,p,\mathcal P)$ given by this estimator does not coincide with the minimax rate of convergence $\varphi_n(\beta,p,\mathcal P)$ whenever $r \ne r_{\max}$. In the next section, we prove that $\psi_n(\beta,p,\mathcal P)$ found in Theorem 4 is an optimal payment for adaptation. 

3.3.2. Adaptive estimation. Criterion of optimality 

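Let $\{\Sigma_{(a,b)}, (a,b) \in \mathcal A\times\mathfrak B\}$ be the scale of functional classes, where $\mathcal A \subseteq \mathbb R^m$ is an $m$-dimensional manifold and $\mathfrak B$ is a finite set. Recall that the family $\Psi = \{\Psi_n(a,b), (a,b) \in \mathcal A\times\mathfrak B\}$ of normalizations is called admissible if there exists an estimator $\hat f_\Psi$ such that

\[ \limsup_{n\to+\infty}\big\{\Psi_n^{-1}(a,b)\,\mathcal R_n^{(q)}[\hat f_\Psi,\Sigma_{(a,b)}]\big\} < +\infty \qquad \forall(a,b) \in \mathcal A\times\mathfrak B. \tag{22} \]

The estimator $\hat f_\Psi$ is called $\Psi$-adaptive.

In the considered problem, $a = (\beta,p)$, $b = \mathcal P$ and

\[ \mathcal A = \{(\beta,p) \in (0,\beta_{\max}]^d\times[1,\infty]^d:\ r(\beta,p,\emptyset) > 0\}, \qquad \mathfrak B = \overline{\mathfrak P}. \]

As follows from Theorem 4, $\psi_n(\beta,p,\mathcal P)$ is an admissible family of normalizations and the estimator $\hat f_n$ is $\psi_n$-adaptive. 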

Let W = {Pn{o:, b) > 0, (a, 6) G ^ x *8} and W = b) > 0, (a, 6) G ^ x *8} be arbi¬ 

trary families of normalizations and put 

T„(a):= inf T„(a,6). 

Wn{a,b) beT. 

Define the set W1^] U „4 as follows: 

:=|ag^; limT„(a)=o|. 

I n^oo J 

The set ['?'/'?'] can be viewed as the set where the family tp “outperforms” the family 
P. For any 6 g 18, introduce 

■.= [aGA. J^T„(ao)T„(a,6) =oo,Vao 
Remark first that the set ['?'/'?'] is the set where the family ^ “outperforms” the 

family Moreover, the “gain” provided by with respect to on ['f'/S'] is much 
larger than its “loss” on /'!/]. 

The idea led to the criterion of optimality formulated below is to say that 'R is “better” 
than 'P if there exists 6 S for which the set is much more “massive” than 

Definition 4- (I) ^ family of normalizations W is called adaptive rate of convergenee 

if 

1. ^ is an admissible family of normalizations; 

2. for any admissible family of normalizations W satisfying A^^'^ ^ 0 

• A^^^^Z'R] is contained in a (m — 1)-dimensional manifold, 

• there exists 18 such that contains an open set of A. 

(II) IfW is an adaptive rate of convergenee, then satisfying (22) is called rate adaptive 
estimator. 


The aforementioned definition is inspired by Klutchnikoff's criterion; see Klutchnikoff 
[27]. Indeed, if $\operatorname{card}(\mathfrak B) = 1$ both definitions coincide. 

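Theorem 5. (i) We can find no optimal rate adaptive estimator (satisfying (5) in Section 1) over the scale

\[ \{\overline N_{p,d}(\beta,L,\mathcal P),\ (\beta,p,L,\mathcal P) \in \mathfrak A\} \]

whenever $\mathfrak A \subseteq \{(\beta,p,L,\mathcal P) \in (0,\beta_{\max}]^d\times[1,\infty]^d\times(0,\infty)^d\times\overline{\mathfrak P}:\ r(\beta,p,\emptyset) > 0\}$ contains at least two elements $(\beta,p,L,\mathcal P)$ and $(\beta',p',L',\mathcal P')$ such that $r(\beta,p,\mathcal P) \ne r(\beta',p',\mathcal P')$.

(ii) $\hat f_n(x_0)$ is a rate adaptive estimator of $f(x_0)$ and $\psi_n$ is the adaptive rate of convergence, in the sense of Definition 4, over the scale

\[ \{\overline N_{p,d}(\beta,L,\mathcal P),\ (\beta,p,L,\mathcal P) \in (0,\beta_{\max}]^d\times[1,\infty]^d\times(0,\infty)^d\times\overline{\mathfrak P},\ r(\beta,p,\emptyset) > 0\}. \]
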

It is important to emphasize that our results cover a large class of problems in the 
framework of pointwise density estimation. 

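In particular, if $\mathfrak{P}=\{\overline{\emptyset}\}$, we deduce that $\hat f_n(x_0)$ is a rate adaptive estimator of $f(x_0)$ over

$$\{N_{p,d}(\beta,L,\overline{\emptyset}),\ (\beta,p,L)\in(0,\beta_{\max}]^d\times[1,\infty]^d\times(0,\infty)^d,\ r(\beta,p,\overline{\emptyset})>0\}.$$

The adaptive rate of convergence for this problem is given by

$$\psi_n(\beta,p,\overline{\emptyset}):=\begin{cases}\left(\dfrac{\ln n}{n}\right)^{r/(2r+1)}, & (\beta,p)\neq(\beta^{(\max)},p^{(\max)}),\\[2mm] \left(\dfrac{1}{n}\right)^{r_{\max}/(2r_{\max}+1)}, & (\beta,p)=(\beta^{(\max)},p^{(\max)}),\end{cases}\qquad r:=r(\beta,p,\overline{\emptyset}),\quad r_{\max}:=\frac{\beta_{\max}}{d}.$$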
To the best of our knowledge, the latter result is new. It is precise and generalizes the
results of Butucea [4] (d = 1) and Comte and Lacour [6] for the deconvolution model
when the noise variable is equal to zero.

Another interesting fact is related to the set of “nuisance” parameters where the adaptive rate of convergence $\psi_n(\beta,p,\mathcal{P})$ coincides with the minimax one. In all problems of pointwise adaptive estimation known to us, this set contains a single element. However, as follows from Theorem 5, this set may contain several elements. Indeed, if, for instance, $d=4$ and $\mathfrak{P}=\{\mathcal{P}_1,\mathcal{P}_2,\mathcal{P}_3\}$ with $\mathcal{P}_1=\{\{1\},\{2\},\{3,4\}\}$, $\mathcal{P}_2=\{\{1,2\},\{3,4\}\}$, $\mathcal{P}_3=\{\{1,2,3,4\}\}$, then $\hat f_n(x_0)$ is a rate adaptive estimator of $f(x_0)$ over

$$\{N_{p,4}(\beta,L,\mathcal{P}),\ (\beta,p,L,\mathcal{P})\in(0,\beta_{\max}]^4\times[1,\infty]^4\times(0,\infty)^4\times\mathfrak{P},\ r(\beta,p,\overline{\emptyset})>0\}.$$

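In this case, the adaptive rate of convergence satisfies

$$\psi_n(\beta,p,\mathcal{P}):=\left(\frac{1}{n}\right)^{r_{\max}/(2r_{\max}+1)},\quad(\beta,p,\mathcal{P})\in\{\beta^{(\max)}\}\times\{p^{(\max)}\}\times\{\mathcal{P}_1,\mathcal{P}_2\},\qquad r_{\max}:=\frac{\beta_{\max}}{2}.$$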

Thus, in the considered example the aforementioned set contains two elements. 
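The count of two elements can be checked numerically. The sketch below is only an illustration: it assumes the index $r(\beta,p,\mathcal{P})=\inf_{I\in\mathcal{P}}\gamma_I(\beta,p)$ with $\gamma_I(\beta,p)=(1-\sum_{i\in I}1/(\beta_ip_i))/\sum_{i\in I}1/\beta_i$, as it is recalled in the proofs of Section 5, and the function names `gamma_I` and `r_index` are hypothetical.

```python
import math


def gamma_I(beta, p, I):
    """gamma_I(beta, p) for a block I of coordinates (1-based indices).

    Assumed form: (1 - sum_{i in I} 1/(beta_i p_i)) / sum_{i in I} 1/beta_i.
    p_i may be math.inf, in which case 1/(beta_i p_i) evaluates to 0.0.
    """
    num = 1.0 - sum(1.0 / (beta[i - 1] * p[i - 1]) for i in I)
    den = sum(1.0 / beta[i - 1] for i in I)
    return num / den


def r_index(beta, p, partition):
    """r(beta, p, P): the smallest gamma_I over the blocks I of the partition P."""
    return min(gamma_I(beta, p, I) for I in partition)


# The example of the text: d = 4, beta = beta^(max), p = p^(max) = (inf, ..., inf).
beta_max = 1.0  # illustrative value
beta = [beta_max] * 4
p = [math.inf] * 4

P1 = [{1}, {2}, {3, 4}]
P2 = [{1, 2}, {3, 4}]
P3 = [{1, 2, 3, 4}]

r_values = {name: r_index(beta, p, P) for name, P in (("P1", P1), ("P2", P2), ("P3", P3))}
r_max = max(r_values.values())
attaining = sorted(name for name, r in r_values.items() if r == r_max)
```

Under these assumptions `r_values` gives $\beta_{\max}/2$ for $\mathcal{P}_1$ and $\mathcal{P}_2$ and $\beta_{\max}/4$ for $\mathcal{P}_3$, so exactly two partitions attain $r_{\max}$.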

Finally, let us note that there is a “ln-price” to pay for adaptation with respect to the
structure of independence even if the smoothness parameters $\beta$, $L$ and $p$ are known. This
result follows from the bound (41) established in the proof of Theorem 5.

4. Discussion: Comparison with the global method in 
Lepski [29] 

The latter paper deals with the rate optimal adaptive estimation of a probability density 
under sup-norm loss. It is obvious that the estimator constructed in Lepski [29] is fully 
data-driven and can be also used in pointwise estimation. However, this estimator is nei¬ 
ther minimax nor optimally minimax adaptive when pointwise estimation is considered. 
Below, we discuss this issue in detail. 

Oracle approach. Obviously, the use of a local method allows better control of the approximation error since $B_{(h,\mathcal{P})}(x_0)$ is smaller than $\sup_{x\in\mathbb{R}^d}B_{(h,\mathcal{P})}(x)$. Moreover, our local method controls the stochastic error better since $\ln\delta(h,\mathcal{P})$ is smaller than $\ln(n)$. The latter fact is explained by the use of different constructions of the selection rule. First, it concerns the choice of the regularization parameter $h$. Whereas Lepski [29] uses kernel convolution, we use the “operation” $\vee$ on the set of bandwidth parameters. Next, in pointwise estimation, we select the parameter $(h,\mathcal{P})$ from a very special set whose construction is new. It is important to emphasize that the parameter set used in Lepski [29] is too “rough” to yield an optimal pointwise adaptive estimator. Both reasons required the introduction of novel technical arguments for pointwise estimation with respect to those in Lepski [29] for estimation under sup-norm loss;



see the definition of our selection rule in Section 2.3, and the proofs of Proposition 2,
Lemma 1 and Theorem 1 in the next section. Note, however, that the adaptation to an
eventual independence structure rests upon the same methodology in both papers.

The following example clearly illustrates how the quality of estimation provided by
Lepski’s estimator can be significantly improved by applying our local method.

Example 3. Considering the problem described in Example 1, we compare both methods.

• Local method. We obtain from our local oracle inequality that 

(E^"'^|7„(a;o) -< (ai\/41n(4) -b 02 )^“^/^, ai,a 2 >0. 

• Global method. The best quality of estimation provided by Theorem 1 in Lepski [29] 
is 

- /d(xo)r)^/" < {2Ci+C2){.n/Hn))-^/\ Ci,C2 > 0. 

It is also important to emphasize that our Theorem 1 presents other advantages with
respect to that in Lepski [29].

(a) We derive our oracle-type inequality over the functional class $\mathbb{F}_d[\mathfrak{f},\mathfrak{P}]$, which contains the class $\mathbb{F}_d[\mathfrak{f}]$ used in Lepski [29]; this allows us to obtain upper bounds under more general assumptions. For instance, if $\mathfrak{P}=\{\{1,\ldots,d\}\}$, we do not need all marginals to be uniformly bounded, which is required when we use Theorem 1 in Lepski [29]; see our Corollary 1 above.

(b) The oracle-type inequality for sup-norm risk cannot be used in general for other
types of loss functions. In contrast, the pointwise risk can be integrated, which allows one
to obtain results under $L_p$-loss; see, for example, Lepski, Mammen and Spokoiny
[30] and Goldenshluger and Lepski [17]. In this context, establishing a local oracle
inequality with the term $\ln\delta(h,\mathcal{P})$ instead of $\ln(n)$ is crucial.

Minimax adaptive estimation. Comparing the minimax rate of convergence defined by
(17), we find a price to pay for adaptation in the pointwise setting. Such a price does not exist
in estimation under sup-norm loss. Note nevertheless that this price to pay for adaptation
is not unavoidable for all values of the nuisance parameter $(\beta,p,L,\mathcal{P})$. This explains
the necessity of introducing the optimality criterion presented in Section 3.3.2.

Let us also compare our results with those obtained in Lepski [29]. 

Example 4. Consider that $\mathfrak{P}$ still contains the elements $\mathcal{P}_1$ and $\mathcal{P}_2$ defined in Example 1
and that $d=2$. Put $\beta_{\max}=1$.

• Local method. In view of our results, our estimator $\hat f_n$ achieves the following minimax
rate of convergence:

inf _ sup 
/ /eAfoo,2(/3(“‘**)T.'P2) 

where infimum is taken over all possible estimators. 
• Global method. In view of the results in Lepski [29], the estimator proposed in
the latter paper achieves the following minimax rate of convergence:

mf _ sup (E^”^||/-/||^)^/‘^x(r^/ln(r^))■^/^ 

f /eAfoo,2(/3('"“),L.P2) 

where infimum is taken over all possible estimators. 

Thus, the application of the procedure from Lepski [29] for pointwise adaptive esti¬ 
mation leads to the logarithmic loss of accuracy everywhere, while our estimator is rate 
optimal for some values of nuisance parameter. 
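The size of this logarithmic loss can be made concrete with a short computation. The sketch below is an illustration only: it assumes rates of the generic form $n^{-r/(2r+1)}$ (local, at the parameter values where no price is paid) and $(n/\ln n)^{-r/(2r+1)}$ (global), with the exponent `r` left as a free, hypothetical input rather than the specific value of Example 4.

```python
import math


def local_rate(n, r):
    """Rate without logarithmic factor: n^{-r/(2r+1)}."""
    return n ** (-r / (2.0 * r + 1.0))


def global_rate(n, r):
    """Rate with logarithmic factor: (n / ln n)^{-r/(2r+1)}."""
    return (n / math.log(n)) ** (-r / (2.0 * r + 1.0))


def log_loss_factor(n, r):
    """Ratio global/local, which simplifies to (ln n)^{r/(2r+1)}."""
    return global_rate(n, r) / local_rate(n, r)


# Illustrative values: the loss factor grows slowly but without bound in n.
factors = [log_loss_factor(10 ** k, 1.0) for k in (2, 4, 6, 8)]
```

The ratio depends on $n$ only through $(\ln n)^{r/(2r+1)}$, so it is increasing in $n$ for every $r>0$.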


5. Proofs of main results 

The main technical tools used in the derivation of the pointwise oracle inequality given in
Theorem 1 are uniform bounds of empirical processes. We start this section by presenting
the corresponding results, whose proofs are postponed to the Appendix. In particular, we
provide the explicit expression of the constants $\lambda_s^{(q)}[K,\mathfrak{z}]$, $q\ge 1$, used in the selection
rule (9)-(10). Our considerations here are mostly based on the results recently developed
in Lepski [28].


5.1. Constants A^'^^[K, 3 ] 


Set for any s G N*,g > 1, Ai'^^[K, 3 ] := {3q + sg]! V 3](1 -I- l/ 2 :)}^/^Ai'^\ where r := 
inf/gi^rd/]) >0, 


Ai^) :=Ai«)[K] 


^lOse® -I- 


10seLK\ 


V (48e) 


77 + 7^(1 + 9)11X11^ 


d^iKii 


S 

00 


and Cl'^l := [lUs5-‘^ + 5q + 3 + SbCd V 1. 

Here, <5* is the smallest solution of the equation 87 r^( 5 (l + [In<5]^) = 1 and 


Cs ■■= s sup — 
<5><5. 0^ 


1 + In 


/9216(s+1)(52 


S sup --r 

S>S, 0 


1 + In 


/9216(s + l)5 




►(<5) 


J + 


s*{S) 


(6A^) 

1 + [In 5] 2 


5.2. Pointwise uniform bounds of kernel-type empirical processes

Let $s\in\mathbb{N}^*$, $s\le d$, and let $Y_i=(Y_{i,1},\ldots,Y_{i,s})$, $i\in\mathbb{N}^*$, be a sequence of $\mathbb{R}^s$-valued i.i.d.
random vectors defined on a complete probability space $(\Omega,\mathfrak{A},\mathrm{P})$ and having the density
$g$ with respect to the Lebesgue measure. Later on $\mathbb{P}_g^{(n)}$ denotes the probability law
of $Y^{(n)}:=(Y_1,\ldots,Y_n)$ and $\mathbb{E}_g^{(n)}$ is the mathematical expectation with respect to $\mathbb{P}_g^{(n)}$.
Assume that $\|g\|_\infty\le\mathfrak{g}$, where $\mathfrak{g}>0$ is a given number.

Set ai‘^^ := (2-y^l + (7[1 V and 


])("■) 

a ■ 




2=1 


:= {h €: nVh>[a‘f'>] ^ln(n)}. 


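For any $h\in\mathfrak{H}_n^{(s)}$, $y_0\in\mathbb{R}^s$ and $u\ge 1$ set also

$$K(y):=\prod_{i=1}^{s}K(y_i),\qquad V_h:=\prod_{i=1}^{s}h_i,\qquad K_h(y):=V_h^{-1}\prod_{i=1}^{s}K(y_i/h_i)\quad\forall y\in\mathbb{R}^s,$$

$$G_h(y_0):=1\vee\left[\int_{\mathbb{R}^s}|K_h(y-y_0)|\,g(y)\,dy\right],\qquad \widehat G_h(y_0):=1\vee\left[n^{-1}\sum_{i=1}^{n}|K_h(Y_i-y_0)|\right].$$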


M := W 11 V In f I + u 


nVh 


Vh 


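For a given $y_0\in\mathbb{R}^s$ consider the empirical processes

$$\xi_h^{(n)}(y_0):=n^{-1}\sum_{i=1}^{n}\bigl[K_h(Y_i-y_0)-\mathbb{E}_g^{(n)}\{K_h(Y_i-y_0)\}\bigr],\quad h\in\mathfrak{H}_n^{(s)},$$

$$\bar\xi_h^{(n)}(y_0):=n^{-1}\sum_{i=1}^{n}\bigl[|K_h(Y_i-y_0)|-\mathbb{E}_g^{(n)}\{|K_h(Y_i-y_0)|\}\bigr],\quad h\in\mathfrak{H}_n^{(s)}.$$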
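To make the objects of this subsection concrete, here is a minimal sketch of the product kernel $K_h$, the kernel average $n^{-1}\sum_i K_h(Y_i-y_0)$ and the empirical quantity $1\vee n^{-1}\sum_i|K_h(Y_i-y_0)|$. It is an illustration under assumptions: the univariate kernel `K1` is a triangular kernel supported on $[-1/2,1/2]$ chosen for convenience (the paper only requires the conditions of its Assumption (7)), and the names `K1`, `K_h`, `kernel_average` and `G_hat` are hypothetical.

```python
import math
import random


def K1(t):
    """Assumed univariate kernel: triangular, supported on [-1/2, 1/2], integrates to 1."""
    return 2.0 * (1.0 - 2.0 * abs(t)) if abs(t) <= 0.5 else 0.0


def V(h):
    """V_h: product of the bandwidth coordinates."""
    return math.prod(h)


def K_h(y, h):
    """Product kernel K_h(y) = V_h^{-1} * prod_i K1(y_i / h_i)."""
    return math.prod(K1(yi / hi) for yi, hi in zip(y, h)) / V(h)


def kernel_average(sample, y0, h):
    """n^{-1} * sum_i K_h(Y_i - y0): the uncentered part of the empirical process."""
    return sum(K_h([a - b for a, b in zip(Y, y0)], h) for Y in sample) / len(sample)


def G_hat(sample, y0, h):
    """Empirical counterpart of G_h(y0): 1 v n^{-1} * sum_i |K_h(Y_i - y0)|."""
    mean_abs = sum(abs(K_h([a - b for a, b in zip(Y, y0)], h)) for Y in sample) / len(sample)
    return max(1.0, mean_abs)


# Illustrative data: uniform sample on [0, 1]^2, evaluation point at the centre.
random.seed(0)
sample = [(random.random(), random.random()) for _ in range(500)]
avg = kernel_average(sample, (0.5, 0.5), (0.4, 0.4))
g_hat = G_hat(sample, (0.5, 0.5), (0.4, 0.4))
```

For a nonnegative kernel such as this one the two averages coincide before the maximum with 1 is taken; the distinction matters only for kernels that change sign.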


Proposition 2. For all y > 1, all integer n > 3 and all number u satisfying 1 < u < 
y ln(n) 

(i) E(-){ sup [|4’^)(yo)|-A(^)wi“^(yo)] + }'<Ci«)(K,g)[nF,(_)]-«/2e-“; 


(ii) E(")<| sup 


\^h\yo)\-lGn{yo) 


<C(^)(K,g)[nK(_)]-«/2g-.. 


J +> 

9\ 1/9 


(eM{ sup [Gn{yo)-2GM] + y) 




The expression of the constant $C_s^{(q)}(K,\mathfrak{g})$ is given in the proof of the proposition.
5.3. Oracle-type inequality

5.3.1. Auxiliary result 


For I Gid and h G (0, set 

bh,ixQj):= [ -xoj)fiixi)dxi, := fi'^\xoj) - bn^ixoj)] 

JRIJ'I 

Ghiixoj) :=1W [ \kI^\xi - xoj)\fixi)dxi , 

Jk\!\ 


G{xo):= sup_ sui^_ sup Gh,\/ni{xo,i)- 

ih,v)&s)p(3] {v,v')es)[<:(3]ieVor' 


For any {h,V) G (0, l]'^ x tp put 


U, 


{h,V 


){xo) := ' 


'[G{xoW{lV\nS{h,V)} 


nV{KV) 


Define also f„(xo) := 12Ad^(2max{G„(a;o), 1 V f||K||f})‘^^ and 

^„(a;o):= su^_ sup_sup [|Ci"vr„(^o./)| - A{G(,j,p)(xo)+G(^,-P')(xo)}] + . 

(/i,P)efl[<p] (rj,-p')efl[‘P] -fePo-p' 


Lemma 1. Set f > 0. For any q>l there exist constants c,; := Ci(2(jf, d, K, f,3), i = 
1,2,3,4, such that'inPZ, V/GFd[f,^], 'i{h,V) VG^{f), 

(i) < ci[nyi„ax]“^^^; 

(ii) (E^"^[G(a;o) - G„(a:o)]+< C2[nVmax]“^^^; 

(iii) < cg; 

(iv) {E‘'P^\U(^h,v)ixo)f‘^f^^'^ < C4l((^h,v)ixo)- 


5.3.2. Proof of Theorem 1 
We divide the proof into several steps. 

(1) Let {h^V) G V G iP(/), be fixed. By the triangle inequality, we have 


\fn{xo) - f{xo)\ < l/^£~)(a:o) - + 1-^' 


rt") 

{h,V),(h,V) 


- f\h]v)‘y^o)\ 


+ l-^r,p)(2^o)-/(a;o)| 


< 2[A(,,_-p)(xo) +2A„(xo)G(/t,p)(xo)] + |.^r.p)(^o) - /(a::o)|- 


(23) 
Here, we have used that $\hat f_{(\hat h,\hat{\mathcal{P}}),(h,\mathcal{P})}(x_0)=\hat f_{(h,\mathcal{P}),(\hat h,\hat{\mathcal{P}})}(x_0)$ and the definition of $(\hat h,\hat{\mathcal{P}})$.

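In what follows, we will use the inequality: for $m\in\mathbb{N}^*$ and $a_i,b_i\in\mathbb{R}$, $i=\overline{1,m}$,

$$\left|\prod_{i=1}^{m}a_i-\prod_{i=1}^{m}b_i\right|\le m\Bigl(\sup_{i=\overline{1,m}}\max\{|a_i|,|b_i|\}\Bigr)^{m-1}\sup_{i=\overline{1,m}}|a_i-b_i|.\qquad (24)$$
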

Here and later, we assume that the product and the supremum over empty set are equal 
to one and zero, respectively. 
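The product inequality (24) follows from a telescoping sum, and it can also be checked numerically. The sketch below is only a sanity check on random inputs, not a proof; the helper names `product_gap` and `bound_24` are hypothetical, and the empty case follows the convention just stated (empty product equal to one, empty supremum equal to zero).

```python
import math
import random


def product_gap(a, b):
    """Left-hand side of (24): |prod a_i - prod b_i| (empty product equals 1)."""
    return abs(math.prod(a) - math.prod(b))


def bound_24(a, b):
    """Right-hand side of (24): m * (sup max{|a_i|, |b_i|})^(m-1) * sup |a_i - b_i|."""
    m = len(a)
    if m == 0:
        return 0.0  # supremum over the empty set is taken to be zero
    big = max(max(abs(x), abs(y)) for x, y in zip(a, b))
    gap = max(abs(x - y) for x, y in zip(a, b))
    return m * big ** (m - 1) * gap


# Random check: the largest observed violation should be zero up to rounding.
random.seed(1)
worst = 0.0
for _ in range(1000):
    m = random.randint(1, 6)
    a = [random.uniform(-2.0, 2.0) for _ in range(m)]
    b = [random.uniform(-2.0, 2.0) for _ in range(m)]
    worst = max(worst, product_gap(a, b) - bound_24(a, b))
```

In the proof, (24) is applied with $m\le d$ factors indexed by the blocks of a partition, which is where the factor $d$ and the power $d-1$ in the subsequent bounds come from.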

(2) Since V €^{f), using (24) we have 

\f}h'*r)(-^o) - f{xQ)\ < d(supmax{G/i,(xo,/),f}) sup|^”^(a:o,/) - //(a:o,/)| 

(25) 

< d(max{G„(a;o),f})‘^ ^[B(h,v){xo) + ^n{xo) + 2AW(/,_-p)(a:o)], 

since G„(a;o) > Ghi{xoj) > 1 and |/^"^(a;o,/) - //(a:o,/)| < ld"^(^o,/)| + \bh,ixo,i) - 

fiixoj)\,yi€r. 

(3) Set := c?[G„(xo)]‘^^'^“^^. For any {r],V') G we get from the inequality (24) 


I (xn) — 


(2:0)1 < 


sup 

I'ev' 


n 


'^n) 
JI 


^inl' ^Vinl 


,{xo,inp)-ft}{xoj') 


lev: irii'^0 


Introduce, for all / €ld and all 77 G (0,1]'^, bh,,ni{xo,i) := /g|/| x^j) fi{u) du. 

Put also := d(max{G„(a;Q),G(a:o)})'^“^. For any ( 77 ,P') G and any /' G V, in 
view of (24), 


n •^”L.W^"'0dn/')- n KnTO'7/n/'(^0,/n/') 
leP: ini'^0 leV: /n/'/0 

( 2 ) _ 


<f. 


n sup \^l’vvin,'^^°dnl')\, 

leV: ICI'^0 


(a;o,/n/') - (2:0,/') 

I^V: lnl '^0 


<{^^^B^d,v){xo). 


For the last inequality, we have used that V G *P(/) and, therefore, for any 77 G (0,1]*^ and 
any I' G Id 


bT,j,ixoj')= K^l'Xxi>-xo,i>) n //n/'(2;/n/')d2;/'= Y\ 

' I^V: lnl'^0 l€V: lr\I'^0 


^Vini' (2^0,70/')■ 
_( 2 )

(4) Applying the triangle inequality, we get since >1 and U(^h,'P){xo) > 0, for any 


< sup sup +'^n'’^{h,v){xo) + 

rev'' lev: inr=^0 ^ 

< + 2fi'^ff en(a:o) + 3Afi'^ff {W(,,pq(xo) +Zi(^,p)(xo)}. 

Put In^ := d[2G„(a:o)]‘'“^ and U{xo) :=sup(^ ,p,jg 3 ^j^jG(,,^-p/)(xo). We obtain that 

^{h,v)i^o) 

< 2fi'’fif^{^(;,,p)(a:o) +en(a:o)} + 3Afi'^{Zi(xo) +%.p)(xo)}[ff 

+ I sup_ [U^^,V) {xo) - U(r,,v) (a:^o)]+ 

^{v,v)&^om 

+ [i^{h,v)ixo) -^(/i.-p)(a;o)]+|; 

^ih,V)ixo) 

<in{xo){B(^h,v)M+^nixo) + [G(xo) - G„(a;o)]_^}, 
where f„(a;o) := 12A(i^(2max{G„(a;o), 1 V f||K||f})‘^^, since A A ||K||i > 1, 


(26) 


u^,,v){xo) < (1V < 1V f||K||f y{p,v') e 


and [a™ — &™]+ < m(max{a, 6})™ ^[a — 6] + , Va, & > 0, Vm G N*. 

(5) Finally, we deduce from (23), (25) and (26), using again A A ||K||i > 1, that 

\fnixo) - f{xo)\ 

(27) 

< 3{n{xo){B(^h,v){xo) +i^ih,v){xo) +U(^h,v){xo) + ^n{xo) + [G(a;o) - G„(a;o)] + }. 


By the Cauchy-Schwarz inequality 

(^f^\fn{xo) - f{xo)n^^‘‘ 

< 3{Ef\{r,ixo)\^‘>f^^'^\B^h,v)ixo) +U^h,v)ixo) + 

+ (E^”)|Cn(xo)|''')'/^'’^ + (E^")[G(xo) - G„(xo)]+')'/^'^^]. 
Applying Lemma 1,


i^f'^lfnixo) - < 3C3[B^h,V)ixo) + {1 + C4)U(^h^-p){xo) + (ci + C 2 )[nKnax] 


and we come to the assertion of Theorem 1 with ai = 3c3(l + C 4 )(l V f||K||f) and a 2 


3C3(ci +C 2 ). 

5.4. Lower bound for minimax estimation 

5.4.1. Auxiliary result

The result formulated in Lemma 2 below is a direct consequence of the general bound
obtained in Kerkyacharian, Lepski and Picard [25], Proposition 7.

Let $(\beta,p,\mathcal{P})\in(0,\infty)^d\times[1,\infty]^d\times\mathfrak{P}$ and $L\in(0,\infty)^d$ be fixed.

Lemma 2. Suppose that there exists {/o,/i} C N*^{(3,L,V) such that is absolutely 
continuous with respect to and 


\fi{xo) - fo{xo)\ > SniP.P.'P)-, 


(28) 


lim sup E 

n —^+00 



(29) 


Then, for all q>l, 



>1{1-VC/{C + A)), 


where infimum is taken over all possible estimators. 


5.4.2. Proof of Proposition 1

Set $\mathcal{N}(x):=\prod_{l=1}^{d}(2\pi)^{-1/2}\exp(-x_l^2/2)$ and let $f_0(x):=\sigma^{-d}\mathcal{N}(x/\sigma)$. It is easily seen that
one can find $\sigma>0$ such that



(30) 


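Let $I=\{i_1,\ldots,i_m\}\in\mathcal{P}$ be such that $r:=r(\beta,p,\mathcal{P})=\gamma_I(\beta,p)$ and let $g:\mathbb{R}\to\mathbb{R}$ be such that
$\mathrm{supp}(g)\subseteq(-1/2,1/2)$, $g\in\bigcap_{i\in I}N_{p_i,1}(\beta_i,1/2)$, $\int g=0$, and $|g(0)|=\|g\|_\infty$. Define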
where $A_n,\delta_{l,n}\to 0$, $l=\overline{1,m}$, as $n\to\infty$, will be chosen later. Note that G G Np^
if 


i/pi. 


^nSi n ' ( ^j,r, 

Vi=l 


< 

Cl 


l = l,m,ci = „^„p. 


m —1 


Introduce 


(31) 



It is obvious that there exists $A_0>0$ such that if $A_n\le A_0$ then $f_1(x)\ge 0$ for any $x\in\mathbb{R}^d$.
Note also that the condition $\int g=0$ implies that $\int f_1=1$. We conclude that $f_1$ is a
probability density. Furthermore, assumptions (30)-(31) and the definition of $f_0$ allow
us to assert that $f_1\in N_{p,d}(\beta,L,\mathcal{P})$. We remark that


\fiixo) - fo{xo)\=c*iAn, cj := (crv^)"* ‘^|g( 0 )r]^exp(-xg ^/ 2 cr 2 ). 

if I 

Then Assumption (28) of Lemma 2 is fulfilled when Sn{P,P,P) < c^A„. 

Since $X_k$, $k=\overline{1,n}$, are i.i.d. random vectors and $\int g=0$, it is easily checked that


E 


(n) 

/o 


dP 


(n) 

/l 


LdP 


(X(")) 


< 


1 + 


< exp 


fojixo,i) 

2\\g\\f^ 

fo,i{xoj) 


C m \ 


11^ 


for n large enough. Here, we have used that supp(G') C n„ := ~ ^i,n/2,xo^i, + 

5i,„/2] and that infa;^gn„ fo,i{xi) > fo,i{xoj)/2 for n large enough. 

( \ 

Since (^^"^)] = 1, Assumption (29) of Lemma 2 is fulfilled if 


exp 


2||3ll 


2m 


fojixQj) 


f ^j,r, 

Vj = l 


-1<C. 


The latter inequality holds if 


nAl < t^, t ■= V^[c^] ^ln(C' + I), 


2||5ll) 


/o,/(a;o./)' 


(33) 
To finalize our proof, we study separately two cases: $r>0$ and $r\le 0$. Note first that
$r=(1-1/s_I)/(1/\beta_I)$, where

$$\frac{1}{s_I}:=\sum_{i\in I}\frac{1}{\beta_ip_i},\qquad\frac{1}{\beta_I}:=\sum_{i\in I}\frac{1}{\beta_i}.$$


(1) Case r > 0. Solving the system 

/ m \ 


\J = 1 


Cl 


I = 1,171, 


nAll^^Sj^nj =t^ 


we obtain 






A^/Pn-ViPnPn)^ A„=rL 


2 \ r/{ 2 r+l) 


R = 


. 1=1 


n(^ 


L- \ V(2/3i,)' 




It is easily seen that $A_n,\delta_{l,n}\to 0$, $l=\overline{1,m}$, as $n\to\infty$, and one can choose $C=1$.

We conclude that, if r > 0, Lemma 2 is applicable with Sn{P,p,V) = 

(2) Case $r\le 0$. We choose $A_n=A$, where the constant $A$ satisfies $0<A\le A_0$. Solving
the system



I = l,m. 



bl,n ^ 



^j,n ^ R2H ; 

3 = 1 


ln((7 + l) 
c^712 


Note that one can choose A such that ^ 1 ^'^d (7 = 1. Since s/ < 1, 

—H 

we obtain the following solution: 


ra = I — I “^0; / = 1, m, n —?► 00 . 

\n J 


We conclude that, if r < 0, Lemma 2 is applicable with SniP^P^V) = clA. 
This completes the proof of Proposition 1 . 
5.5. Upper bounds for minimax and adaptive minimax estimation


The proof of Theorems 3 and 4 is based on application of Theorem 1. Note that in 
view of the embedding theorem for anisotropic Nikolskii classes (formulated in the proof 
of Lemma 3), there exists a number f := f(/3,p) > 0 such that sup/gp ||//||oo < f if 
r{l3,p,V) > 0 or such that sup^g^* sup/g-p ||//||oo < f if r{P,p,^) > 0. It makes possible 
the application of Theorem 1. 


5.5.1. Auxiliary result 

The result formulated in Lemma 3 below is a consequence of Theorem 6.9 in Nikolskii 
[34]. 

Let $l\ge 2$ be a fixed integer and $\mathfrak{P}$ be a fixed set of partitions of $\{1,\ldots,d\}$.

Let $f\in N_{p,d}(\beta,L,\mathcal{P})$, where $\beta\in(0,l]^d$, $\mathcal{P}\in\mathfrak{P}$, $p\in[1,\infty]^d$ satisfy $r(\beta,p,\mathcal{P})>0$ and
$L\in(0,\infty)^d$.

Lemma 3. There exists c := c{'K,d,p,l,'P) > 0 such that 

Bhuvr M < VP' G V/ G P o P', V(h, 77 ) G (0,1]'" X [0,1]'", 

iGl 

where Bh,,rii{xo,i) is defined in Section 2.4, fi{I) := >c{I)j3iK~^{I), >c{I) := 1 — 
T,keAl^kPk)~^ and k,{I) := 1 - Y^kaAPk^ - ■ 

The proof of this lemma is given in the Appendix. 


5.5.2. Proof of Theorem 3 

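For all $I\in\mathcal{P}$, consider the following system of equations:

$$h_i^{\beta_i(I)}=h_j^{\beta_j(I)}=\sqrt{\frac{1}{nV_{h_I}}},\qquad i,j\in I,$$

and let $\mathrm{h}_I$ denote its solution. One can easily check that

$$\mathrm{h}_i=n^{-(\gamma_I(\beta,p)/(2\gamma_I(\beta,p)+1))(1/\beta_i(I))},\qquad i\in I.\qquad (34)$$

Here, we have used that $1/\gamma_I(\beta,p)=\sum_{i\in I}1/\beta_i(I)$.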
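The closed form (34) can be verified numerically. The sketch below assumes that the system balances the bias-type terms $h_i^{\beta_i(I)}$ against the stochastic term $(nV_{h_I})^{-1/2}$ with all constants dropped, which is the form suggested by (34); the function name `balanced_bandwidths` is hypothetical, and the values $\beta_i(I)$ are passed in directly rather than computed from $(\beta,p)$.

```python
import math


def balanced_bandwidths(n, beta_I):
    """Solve h_i^{beta_i} = h_j^{beta_j} = (n * prod_i h_i)^{-1/2} over one block I.

    beta_I holds the exponents beta_i(I), i in I. With 1/gamma = sum_i 1/beta_i,
    the common value is t = n^{-gamma/(2 gamma + 1)} and h_i = t^{1/beta_i}.
    """
    gamma = 1.0 / sum(1.0 / b for b in beta_I)
    t = n ** (-gamma / (2.0 * gamma + 1.0))
    return [t ** (1.0 / b) for b in beta_I]


# Illustrative check with two coordinates of different smoothness.
n = 10 ** 4
beta_I = [1.0, 2.0]
h = balanced_bandwidths(n, beta_I)
bias_terms = [hi ** b for hi, b in zip(h, beta_I)]
stochastic_term = (n * math.prod(h)) ** (-0.5)
```

All entries of `bias_terms` agree with `stochastic_term`, and their common value is $n^{-\gamma/(2\gamma+1)}$, the rate that appears at the end of the proof once the supremum over $I\in\mathcal{P}$ is taken.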

We note that 2“^nF(h,P) > a ^ ln(n) for all n large enough. To get the statement 
of the theorem, we will apply Theorem 1 with 3 = 1, r(s) = 1, s = l,...,d, = h/ if 

/ G P and = 1 if i G /, / ^ P, b = {h}, Cp = {P}. Thus, b[*P] is non-empty for n 
large enough and we get 



/] < ai(cL V 1) 


sup 

./GP 


Eh: 


ft(/) 


iG/ 


-I- sup 

IGV 



+ a 2 sup 
/GP 



(35) 
where $L:=\sup_{i=\overline{1,d}}L_i$. Here, we have used Lemma 3 and the definition of $B_{(h,\mathcal{P})}(x_0)$.
We deduce from (34) and (35)

'^n'^lfih'P)’— [2ai(cLVl) + a 2 ] sup = [2Q!i(cLVl) + a 2 ]«~'’'^^^’'''~^^ 

^ ^ lev 

and the assertion of Theorem 3 follows. 


5.5.3. Proof of Theorem 4 

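Set $(\beta,p)\in(0,\beta_{\max}]^d\times[1,\infty]^d$ such that $r(\beta,p,\overline{\emptyset})>0$, $\mathcal{P}\in\mathfrak{P}$, $L\in(0,\infty)^d$, and $f\in N_{p,d}(\beta,L,\mathcal{P})$.

Let us first note the following simple fact. If $\mathcal{P}'\in\mathfrak{P}$ and $J=I\cap I'$, $I\in\mathcal{P}$, $I'\in\mathcal{P}'$, we
easily prove that $\beta_i(J)\ge\beta_i(I)$ $\forall i\in J$; see, for example, Lepski [29], proof of Theorem 3,
for more details. Thus, in view of Lemma 3,

$$B_{(h,\mathcal{P})}(x_0)\le c\sup_{I\in\mathcal{P}}\sum_{i\in I}L_ih_i^{\beta_i(I)}\quad\forall h\in(0,1]^d.\qquad (36)$$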


Recall that I Gid, is the projection on the dyadic grid in (0,1]VI of hj^^ given in 
(20) and note that 2“^nV,(j-) > a“^ln(n) for n large enough. Thus, Io[*P] is non-empty 
and one can apply Theorem 1. 

If r{l3,p,'P) = Cmax, then it is obvious that (/3,p) = (/3(™^^),p(“^^)) and that d{V) = d. 
Thus, in view of the definition of the multibandwidths I GP, inf/g-p = Vmax- 
It follows from Theorem 1 and (36) 


where L := sup^^Yd^i- Since rmax = Pmax/d, we conclude that there exists a constant 
C > 0 such that 


[fn, f]<C[ai{cLWl){d+l)+ . 

If r{P,p,'P) < Tmax we solve, for all I gV, the system 


(37) 


r _ T MI) _ Hn) 

L,h, -L.h, 


i,jGl. 


The solution is 


hi = L, 


/ L(/)ln(n) \ 7 R/ 3 .P)/( 27 R/ 3 ,P)+i)i/ft(/) 


v ^ 


L{I) = l[L 


i/5iii) 


iei 


iGljGV. 


(38) 
It is easily seen that {h,V) G for n large enough. Replacing h by its projection h on
the dyadic grid Sj, one has (h.V) G .15 [iP] for n large enough. We deduce from Theorem 1 
and (36) 




c sup ^ Li/if 


Pi{i) 


lev 


iei 


sup ^ 
lev 


I ln(n) 


nVi 


hi 


■ CX2 [nI4i 


1 - 1/2 


The assertion of Theorem 4 follows from (37), (38) and (39). 


(39) 


5.6. Lower bound for adaptive minimax estimation and optimal 
rate 


5.6.1. Auxiliary result 

To get the assertion of Theorem 5, we use the following lemma which is due to an oral 
communication with O. Lepski. This result can be viewed as a generalization of Lemma 2. 

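Let $(\beta,p)\in(0,\beta_{\max}]^d\times[1,\infty]^d$ such that $r(\beta,p,\overline{\emptyset})>0$, $\mathcal{P}\in\mathfrak{P}$, $L\in(0,\infty)^d$ and
$(\beta',p')\in(0,\beta_{\max}]^d\times[1,\infty]^d$ such that $r(\beta',p',\overline{\emptyset})>0$, $\mathcal{P}'\in\mathfrak{P}$, $L'\in(0,\infty)^d$ be fixed.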


Lemma 4. Set (a„) and (bn) two sequences such that a„, 6n,/'n/an —t cxd, n —?► oo. Sup¬ 
pose that exist /o G N 2 := Np>^d(l3', L',V') and fi G Ni := Np^d(/3,L,V) such that is 

(n) 

absolutely continuous with respect to P^^'^ and 


\fi{xo)- fo{xo)\=an^-, 


E 


dP 


LdP 


(X(fo) 


< 


(40) 


Then, for any q>l, 


lim inf inf 


sup Ef”^{a„|/„(xo) - f(xo)\}'^ + sup {bn\fn(xo) - f(xo)\} 


(fo I 


+ /n '-/GVi ■' feN2 

where infimum is taken over all possible estimators. 
The proof of this lemma is given in the Appendix. 


> 


2 ’ 


5.6.2. Proof of Theorem 5 

(1) Set 7Vi :=Np^diP,L,V), N 2 ■.= Np,,d{P',L',V'), ri := r 
such that 0 < ri < r- 2 . For any r such that — 

satisfying: Vg > 1, 


(P,p,P) and 1-2 ■.= r(l3’,p',T”) 
1 there exists C(r) > 0 


lim inf inf 

/n 


> + 00 


sup Ef”’ 

,/eVi 


ln(n) 


ri/(2ri + l) 


|/n(a;o)-/(a:o)| 
+ sup Ef^{n'^\fn{xo) - f{xo)\y 
feN2 


>C{t). 


(41) 


Let us prove (41). The proof is based on Lemma 4 where we put 

/ \r'i/(2ri + l) 

a„ := [2 C(t)]-'/« j , &„ := [2C(r)]-'/''n^ 

and the constant C'(t) > 0 will be specified later. 

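Similarly to the proof of Proposition 1, set $\mathcal{N}(x):=\prod_{l=1}^{d}(2\pi)^{-1/2}\exp(-x_l^2/2)$ and define
$f_0(x):=\sigma^{-d}\mathcal{N}(x/\sigma)$, where $\sigma$ is chosen in such a way that

$$f_0\in N_{p',d}(\beta',L',\mathcal{P}')\cap N_{p,d}(\beta,L/2,\mathcal{P}).$$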


Let also /i be given in (32). It is obvious that there exists a constant Aq such that 
/i G Ni if An < Aq and 



/ = l,m,c; = 




(42) 


Assumptions of Lemma 4 are, respectively, fulfilled if 


clAn > [2C(r)] 


i/q f ln(n) 
V n 


ri/(2ri+l) 


exp 


ft.,,. 




cl:={crV^)^ ‘^|g(0)rPexp(-a;g^,/2cr2); (43) 

-ri/(2ri + l) 

^ I \ 

< 


Kl=l 


ln(n) 


The latter inequality, in its turn, holds if 
=i^ln(n 

Solving the system 


t:=x c*A T- 


ri 


2ri + l 


2m 


Cn — 


2||g|l2 

fo,l{xoj) 


■ (44) 


= —/ = l,m. 
Cl 


nA^l J =t^ln(n), 




we obtain 


Sl,n = 


Cl 


i/ft, 


ln(n) 






r-i/(2r-i + l) 
R = 




1=1 


l/{l-l/si-l/2Pi) 


It is easily seen that An,5i^n —>'0,l = l,m,ifn—>- 00 . The choice C'(t) = 

completes the proof of the inequality (41). Assertion (i) of Theorem 5 follows.

(2) Let us recall the definition of the set $\mathcal{A}\times\mathcal{B}$, which is the set of “nuisance” parameters
for the considered problem.

A:={iP,p)€{0,Pm3.xf x[l,oof- r(^,p,0)>O}, » := 

Let ipn be an admissible family of normalizations and let fn{xo) be V'n-adaptive esti¬ 
mator. Define 

\{(3,p) eA-. lim T„(/3,p)=o|, 

1. n—¥oo ) 

T„(/3,p) := inf T„(/3,p,iP), T„(/3,p,iP) := 

Pe<p Vn{P,p,V) 

where ipn is given in (21). For any 7^ £ put also 

A^^\i)/il)\ := |(/3,p) G A: J[^T„(/3o,Po)Tn(/3,p,T’) =oo,V(/3o,po) S 

In the slight abuse of the notation, we will use later ipnir) instead of '4’n{P,P,'P), r = 
riPyP,!^)- 

For any (/3o,Po) G ['0n/'0n] introduce 


Vo :=arg in^T„(^o,Po,T’), ro := r(/3o,Po,^o)- 
■Pe<p 


(45) 


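Let us first note that $0<r_0<r_{\max}$ for any $(\beta_0,p_0)\in\mathcal{A}^{(0)}[\tilde\psi_n/\psi_n]$. Indeed, if $r_0=r_{\max}$,
then $(\beta_0,p_0)\in\mathcal{A}^{(0)}[\tilde\psi_n/\psi_n]$ contradicts the fact that $\psi_n(r_{\max})$ is a minimax rate of convergence.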
Moreover, for any r G (rojrniax), there exists (/3,p) G A and P G such that r{P,p,V) = 
r. It suffices to choose V such that = r-max = Pmax/\I\, I £V, and 

/3i = r|/|, pi = oo, i = l,...,(i. 

(3) Our goal now is to prove that for any {Po,Po) G A^^'^[ijjn/'P’n] we have 


lim T„(/3o,Po)T„(/3,p,T’) = 00 V(/?,p,T’): rg < r{P,p,'P) < r„ 


(46) 


Set Ng := Np^XPo,Lg,Po) and N := Np^diP,L,P) such that rg < r{P,p,P) < r^ax- 
Applying the inequality (41) with ri = rg, Ni = Ng, r 2 =r and N 2 = N, we get for any 
T satisfying 


ro 


2ro + l 


< r < 


2r+l 


lim inf 

n —^+00 . 


sup Ei”^{V’„^(ro)|/„(xo) - fixo)\y + sup E)”^{n”|/„(xo) - f{xo)\Y 
feNo feN 


A"); 


>C{t). 


(47) 
Furthermore, by definition of $\tilde f_n(x_0)$ and $\tilde\psi_n$, there exist constants $M_0,M>0$ such that
for all $n$ large enough

sup Ef'>{^~'^{l3o,Po,'Po)\fn{xo) - f{xo)\y < Mo; (48) 

feNo 

snpEf'^{^-'^{f3,p,'P)\fr,{xo) - /(a;o)|}'* < M. (49) 

feN 


Note that hm„_>oo p” Poj = 0 that follows from (/3o,Po) G as well as 

the definition of Vq. Thus, we obtain in view of (48) that 

lim sup Ei”^{V';(^(ro)|^(xo) -/(xo)!}’’= 0. 

"^°°/eAfo 

It yields together with (47) and (49) that 

lim inf Mn'^i/)ji(/3,p,P) > C'(r). (50) 

n—^+oo 

Recall that '<pn[r) = (ln(n)/n)’'/(^'’+^). Since r < we get for some a > 0 satisfying 
r + a < 27 ^ that rGipnix) < n““ for n large enough. Hence, we obtain in view of (50) 


liminfn “Tji(/3,p,7^) := liminf n “ 

n—>-+oo n—)-+oo 


1pn{l3,P,'P) 

tljn{P,P,'P) 


^ C{r) 
- M 


(51) 


Furthermore, since pniPoiPo^ 'Po) is a minimax rate of convergence, there exists a constant 
Ml > 0 such that 


Tn(/5o,4'o) 


fpn{Po,PO,'Po) ^ ^ Pn{l3o,Po,'Po) 
i’n{Po,PQ,'Po) ~ ^i’niPo,Po,'Po) 


Mi[ln(n)]-’'“/('"“+^) 


(52) 


for all n large enough. We deduce from (51) and (52) that lim„_>oo '^n{l3o,Po)'^n{l3,p,'P) = 
oo. ^ ^ 

(4) Let (/3i,pi) G and {P 2 ,P 2 ) G A^^'^ltpn/'^n] be arbitrary pairs of param¬ 

eters. Let also Vi and P 2 be defined in (45) where (/3o,Po) is replaced by (/3i,pi) and 
{f^ 2 ,P 2 ), respectively. Then necessarily 


r(/3i,pi,T’i) =r(/32,P2,T’2). 


(53) 


Indeed, assume that r(/3i,pi,T^i) < r{f32,P2,'P2)- Noting that T„(/32,P2) = T„(/32,P2,T’2), 
in view of the definition of V 2 we deduce from (46) with (/3i,pi) = (/3o,po) and (/3,p,P) = 
(/32,P2,T’2) that 

Tf'n(/32,P2)-too, n-^ 00 . (54) 

This contradicts $(\beta_2,p_2)\in\mathcal{A}^{(0)}[\tilde\psi_n/\psi_n]$. The case $r(\beta_1,p_1,\mathcal{P}_1)>r(\beta_2,p_2,\mathcal{P}_2)$ is treated
similarly.

(5) We are now in a position to prove Theorem 5.

First, if ^ 0, we deduce from (53) that there exists G (0,rmax) such 

that 

ril3,p,P{i3,p)) = ro V(/3,p) (55) 

Here, as previously, 'P(p,p) := arginfpg^T„(/3,p,P). 

Recall that, for {j3,p,V) G ( 0 , Foo)'^ x [1,00]"^ x 

r(/3,p,P)=inf'yi(/?,p), jp(/3,p) = -—ISP. 

Thus, obviously 

dim(yl^°^ [^n /V'n]) < 2d - 1. (56) 

Next, let 7^* G fp be a partition satisfying = rmax- We deduce from 

(46) that 

2 {(/3,p) e .4: ro < r(/3,p,T’*) < Tmax}, (57) 

where $r_0$ is defined in (55). Thus, $\mathcal{A}^{(\infty)}_{\mathcal{P}^*}[\tilde\psi_n/\psi_n]$ contains an open set of $\mathcal{A}$ since $(\beta,p)\mapsto r(\beta,p,\mathcal{P}^*)$ is continuous. This together with (56) completes the proof of the theorem.


Appendix 

A.l. Proof of Proposition 2 

Our goal is to establish a uniform bound for the empirical process $\{\xi_h^{(n)}(y_0)\}_h$. Note that
the considered family of random fields is a particular case of the generalized empirical
processes studied in Lepski [28]. We get the assertions of Proposition 2 from Theorem 1
in the latter paper since it allows us to assert that, for any $u\ge 1$, $q\ge 1$ and any integer
$n\ge 3$


E^”)| sup < C'(«)(K,g)[nT4(max)] «/^e “, 

W(“’«)(n,h,2/o) 

_ (58) 

:= c(K, s,q)W V + 21n(2 + InGhiyo)) + Mj 

+ 11 V In + +'«|- 

The constants (K,g) and c(K,s,g) are given later. 
Thus, we only have to check the assumptions of Theorem 1 in Lepski [28] and to match
the notation used in the present paper with that of the latter one. We divide this proof into
several steps.

(1) For our case, we first consider that p = 1, m = s + 1, k = s, ^\{n) = 'Hn \ = 

{yo}, = h and 


G^h) 

c 


^(min) 

nj 

j=T7s, p^'*)(^,h) = max|ln(/Jj)-ln(hj)|. 


Obviously, Assumption l(i) in Lepski [28] is fulfilled. Using Assumption (7) (see Sec¬ 
tion 2.1 of the present paper), we get supp(Ar) C [—1/2,1/2]'* and 


\K(x)-K{y)\<L^^^^\x,-y,\ Vx,yeR7 s||K1|^1Lk > 0. 

1 = 1,S 


Thus, we easily check that, for any h,h' £% 


is) 


and any y € R®, 


\Khiy-yo)-Kh'iy-yo)\ 


< 


IKII 


Vh 


■ V 


IKII 


Vh- 


-is) 


exp{sgi^\h,h')) - 1 -h —^(exp(y[/)(/i, h')) - 1) L 


IK 


It implies that Assumption l(ii) in Lepski [28] holds with 


Do{z) = exp (sz) - 1 -I- 


L 


(s) 

K 


IKII 


X (exp(z) - 1), 


Ds+i = 0, = 0. 


s+l : 


Furthermore, Assumption 3 in Lepski [28] holds with N = 0 and i? = 1 since Il/Yi “ 
ijs+i = {yo} and Assumption 2 in Lepski [28] is not needed since ni = n 2 = n. 

(2) Thus, the application of the Theorem 1 in Lepski [28] is possible. Let us first 
compute the constants which appear in its proof. 


CN,R,m,k = sup 6 ‘‘s 
s>s. 


1-hln 


/9216(s-f 1)(52 

V [^*(< 5 )]^ 


+ sup (5 
J + s>s. 


\ , , /9216(s + l)5 

'+‘"1 i.-wi 


1 + 


■■=Cs 


Cd = se" 


seLj^ 

IIKIU 


CD,b = \/‘2 .Cd V [(2/3)(Gn V 8e)], 


Ai = 4v'2eGc, A 2 = (16/3)(Gz3 V 8e). 
Next, we have to compute the quantities involved in the description of f)). 


M,ih) < 


1 V In 


V^(max) 

14 


:= [lAAs5-^ + 5<7 + 3 + 36C',] V 1. 


Since Yi,i = l,n, are identically distributed, putting f) = (h, yo), ni = n 2 = n and r = 0, 
we have 


i^n,r(f)) = lV 


\Khiy-yo)\giy)dy 


■= Ghivo), 


Fn= sup G/,(?/o)<lVg||K||J; 


h&n. 




UG’O) in, t)) < (n, h, yo), c(K, s, q) := [(lOGz,) V (48e)]Gi;') ||K||^ 


Here, we have used that A ||K||^ > 1. Thus, we come to the inequality (58) with 
Gi"^(K, g) := c,||K||^(l V g||K||f)42, c, = 2^<^G+^3'i+^riq + l)(Cz,,b)«. 

(3) If n > 3,nVh > ln(n),l < u < q\n{n) and M(h) := 1 V since 1 < 

Ghivo) < ||K||^, one has 


(nVh) ^{M(h) + 2\n{2+ \nGhiyo)) + u} <7{nVh) ^Ghiyo){M{h) + u} 

<7{l + q)\\Kr^. 


(59) 


Put finally a1'^^[K] := c(K, s, q)-\/7{-y7(r+'g)||K||^+1}. Since > 1, the asser¬ 

tion (i) of Proposition 2 follows from (58) and (59). Let us now prove the assertions (ii) 
and (iii) of Proposition 2. 

(4) First, in view of the definition of (n), we get the assertion (ii) from the assertion 

(i) of Proposition 2 since u < q\n{n) and [1V (1 -f q)ai'^^ = 1/2. Here, we have used 

that if K satisfies the assumption (7), see Section 2.1, |K| satisfies it as well and, therefore, 

—(ri) 

Proposition 2(i) is applicable to the process (yo). 

Next, using the trivial inequality $|x\vee a-x\vee b|\le|a-b|$, $x,a,b\in\mathbb{R}$, we easily check
that


Gh ivo ) < 2 G /1 (yo) + 2 sup 

heSji'’\n) 



(2/o)| - ^Ghivo) 


+ 


V/i e i3^'"j(n). (60) 


Assertion (iii) of Proposition 2 follows from assertion (ii) and (60). 
A.2. Proof of Lemma 1


Note first that, for any {h, V) G S) [Cp], any {rj, V') GSj [fp] and any I f) I' GV oV' 


hinr'^Vinr e IJ [j 


(ini') 

I ’ 


m—1 1—1 


ie/n/' 




(Ini' ,171,i) 


■ nVh, > [«|jn/'|] 


where ^ V iGlfM'. 

Set / G Fd[f,*P]. To get the assertions of Lemma 1, we apply Proposition 2 with s = 

5 = //n/S g = f, = K,= 

= Ghj^j,{xo,ini’)i Ghiyo) = Ghj^j,{xo,ini')i (a;o./n/')> 

d"^(2/o)=d”n,_^(2;o,/n/')- _ _ 

Recall that fp := {V oV: V,V' G ip}. In view of the definition of jo[*P], we easily 
check that 

M„(7) M„(Z') 
inl'GVoV' m—1 1—1 


with u = g[l V ln(2™^Vniax/inf/GZ’Pf,( 0 )] G [l,2gln(n)], since V^(i) > and Mn{I) < 
^og2{n), yi G Id- 

Therefore, it follows from the assertion (i) of Proposition 2, since V ii) (in > 

^ Ini' 

inf/g-p Vj^(j), 


(«?■’{ 


hic\if ^-^0 


sup [id”!,,, (a;o,/)| - 


(ini') 


ix(,,ini')U}y'^^' 


< Cl[nKnax]“^''^, 


'■ - E Eid/T’fK.fii''"” 

•Peqj* 


- 2[(3I^I)^11/2 ■ 

2[(3|/|)A1]/2_1 


Similarly, applying Proposition 2(iii) and using the trivial inequality [sup^x^ — 
supj 2 /i]+ < supjxi — ?/i] + , we obtain the assertion (ii) of Lemma 1 with C 2 := 2ci. 
Next, it is easily seen that


Gn(xo) ^ 2 



M„(7) Mn(l') 

E E 

m—l 1—1 hr 


sup 
^ L 


ifrv) 


,(^0,/)| - -^Gh, 


,, i^oj) 


^+J 


+ 3 G(xo), 


and that 

Thus, we get assertion (iii) of Lemma 1 from assertion (ii) of Proposition 2 with 


C3 :=12Ad^ 


E 

•Peqj* 


E 

l€VoV 


r^{ 2 qP) 


(K,f)} 


l/{2qP) 


2[(3ld)Al]/2 


2[(3|/|)Al]/2_ 1 


1 


+ 8 (lVf||K||?) 


Similarly, we obtain assertion (iv) of Lemma 1 with 


C4 


:=2 




2[(3|/|)A1]/2_1 


+ 3 (lVf||K||?). 


This completes the proof of Lemma 1. 

A.3. Proof of Lemma 3 

The proof of this lemma is based on the embedding theorem for anisotropic Nikolskii
classes; see, for example, Theorem 6.9 in Nikolskii [34].

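Let $\mathcal{P}' \in \mathfrak{P}$ and $I \in \mathcal{P} \diamond \mathcal{P}'$ be fixed. Set $\varkappa(I) := 1 - \sum_{k\in I}(\beta_k p_k)^{-1}$ and $\beta_i(I) := \varkappa(I)\beta_i\varkappa_i^{-1}(I)$, where $\varkappa_i(I) := 1 - \sum_{k\in I}(p_k^{-1} - p_i^{-1})\beta_k^{-1}$. Since $\varkappa(I) > 0$ there exists $c_I := c_I(K, |I|, p_I, \beta_I) > 0$ such that

$$\mathbb{N}_{p_I,|I|}(\beta_I, L_I) \subseteq \mathbb{N}_{\infty,|I|}(\beta(I), c_I L_I).$$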
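
To make the embedding exponents concrete, the following Python sketch computes them for given smoothness and integrability vectors. It assumes the reading $\varkappa(I) = 1 - \sum_{k}(\beta_k p_k)^{-1}$, $\varkappa_i(I) = 1 - \sum_{k}(p_k^{-1} - p_i^{-1})\beta_k^{-1}$ and $\beta_i(I) = \varkappa(I)\beta_i/\varkappa_i(I)$, with all sums taken over the coordinates in $I$; the function name `embedded_smoothness` is hypothetical.

```python
from fractions import Fraction


def embedded_smoothness(beta, p):
    """Return (kappa, [beta_i(I)]) for the coordinates of one block I.

    beta: smoothness parameters beta_k, k in I
    p:    integrability parameters p_k, k in I
    Exact rational arithmetic is used to avoid rounding effects.
    """
    beta = [Fraction(b) for b in beta]
    p = [Fraction(q) for q in p]
    kappa = 1 - sum(1 / (b * q) for b, q in zip(beta, p))
    if kappa <= 0:
        raise ValueError("the embedding requires kappa(I) > 0")
    exponents = []
    for b_i, p_i in zip(beta, p):
        kappa_i = 1 - sum((1 / q - 1 / p_i) / b for b, q in zip(beta, p))
        exponents.append(kappa * b_i / kappa_i)
    return kappa, exponents


print(embedded_smoothness([2, 2], [2, 2]))
```

In the one-dimensional case the formula reduces to $\beta - 1/p$, the familiar loss of smoothness when passing from an $L_p$-type class to an $L_\infty$-type class.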

Introduce the family of $|I| \times |I|$ matrices $E_j := (e_1, \ldots, e_j, 0, \ldots, 0)$, $j = 1, \ldots, |I|$, and let $E_0$ be the zero matrix. For any $(h,\eta) \in (0,1]^d \times [0,1]^d$, using a telescopic sum and the triangle
inequality, we get


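$$
|B_{h_I,\eta_I}(x_{0,I})| \le \sum_{j=1}^{|I|} \Big| \int K^{(I)}(u)\big[f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_j u) - f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_{j-1}u)\big]\,du \Big|.
$$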
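For $j = 1, \ldots, |I|$ put

$$
B_{h_I,\eta_I,j}(x_{0,I}) := \int_{\mathbb{R}} K(u_j)\big[f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_j u) - f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_{j-1}u)\big]\,du_j.
$$

If $\eta_j \ge h_j$, then $B_{h_I,\eta_I,j}(x_{0,I}) = 0$; if not, we put $[u]^j := u - u_j e_j$, $u \in \mathbb{R}^{|I|}$, and we have

$$
\begin{aligned}
B_{h_I,\eta_I,j}(x_{0,I}) ={}& \int_{\mathbb{R}} K(u_j)\big[f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_j u) - f_I(x_{0,I} + [\eta_I u]^j + (h_I \vee \eta_I - \eta_I)E_{j-1}u)\big]\,du_j \\
&+ \int_{\mathbb{R}} K(u_j)\big[f_I(x_{0,I} + [\eta_I u]^j + (h_I \vee \eta_I - \eta_I)E_{j-1}u) - f_I(x_{0,I} + \eta_I u + (h_I \vee \eta_I - \eta_I)E_{j-1}u)\big]\,du_j.
\end{aligned}
$$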

Thus, in view of the triangle inequality, 

\Bh,,r,:{xo,l)\ < 

fti 7^1 

c := c(K,d,p,Z,T’) = 2||K||f sup sup c/(K, |J|,p/,/). 

V'€'^ 

Here, we have used Taylor expansions of $f_I \in \mathbb{N}_{\infty,|I|}(\beta(I), c_I L_I)$, the product structure of the kernel, the Fubini theorem, that $\beta(I) \in (0,\ell]^{|I|}$ and (19); see Section 3.2. We have also used
that $K$ is compactly supported on $[-1/2,1/2]$ and that $\|K\|_1 \ge 1$.
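
The telescoping device behind this proof can be illustrated with a short Python sketch. It builds the intermediate arguments $x_0 + \eta u + (h \vee \eta - \eta)E_j u$ for $j = 0, \ldots, m$, with all products taken coordinate-wise and $E_j u$ keeping the first $j$ coordinates of $u$; the function name `telescoping_points` and the numerical values are illustrative assumptions.

```python
def telescoping_points(x0, h, eta, u):
    """Arguments x0 + eta*u + (h v eta - eta) * E_j u for j = 0, ..., m.

    E_j u keeps the first j coordinates of u and sets the others to zero,
    so consecutive points differ in at most one coordinate.
    """
    m = len(u)
    gap = [max(h_i, e_i) - e_i for h_i, e_i in zip(h, eta)]
    points = []
    for j in range(m + 1):
        ej_u = [u[i] if i < j else 0.0 for i in range(m)]
        points.append(
            [x0[i] + eta[i] * u[i] + gap[i] * ej_u[i] for i in range(m)]
        )
    return points


pts = telescoping_points(
    x0=[0.0, 0.0, 0.0],
    h=[0.5, 0.25, 0.5],
    eta=[0.25, 0.5, 0.0],
    u=[1.0, 2.0, -1.0],
)
for point in pts:
    print(point)
```

The first point equals $x_0 + \eta u$ and the last equals $x_0 + (h \vee \eta)u$, so the differences along the chain sum to the full increment. In the second coordinate $\eta_2 \ge h_2$, hence the points for $j = 1$ and $j = 2$ coincide and the corresponding term vanishes.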

A.4. Proof of Lemma 4 


Put $T_n := a_n|\tilde{f}_n(x_0) - f_0(x_0)|$ and

7^(/)[a„,6„,7,/] := sup {an\fn{xo) - f{xo)\y + sup E77^n|7ri(a;o) -/(a;o)|}‘^- 

/GAfi feN2 

It is easily seen that TZn^ [a„, bn, /, /] > 'R-n'^ [a„, /, /] and that 

7^7) K, bn, 7 /] > E^fY^{\Tn - 1|} + —E^;^^{Tn}. 

Here, we have used the triangle inequality and the assumption $a_n|f_1(x_0) - f_0(x_0)| = 1$.
Put also Cn := ^ and Zn '■= —We obtain 


7^7^ [an, bn, /, /] > E/" {c„ A Z„} > i 


1 - 
Here, we have used the trivial equality $a \wedge b = \frac{1}{2}\{a + b - |a - b|\}$, that $\mathbb{E}_{f_0}^{(n)}\{Z_n\} = 1$ and the Cauchy-Schwarz inequality. Using the third assumption, we also have $\mathbb{E}_{f_0}^{(n)}\{c_n - Z_n\}^2 \le c_n^2 - c_n$. Finally, for $n$ large enough,


[an,bnJJ] > 
f 



+ 1 - y/cl -Cn 
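
The two elementary facts invoked in this proof, the identity $a \wedge b = \frac{1}{2}\{a + b - |a - b|\}$ and the Cauchy-Schwarz bound $\mathbb{E}|X| \le (\mathbb{E}X^2)^{1/2}$, combine into $\mathbb{E}\{c \wedge Z\} \ge \frac{1}{2}\{c + \mathbb{E}Z - (\mathbb{E}(c - Z)^2)^{1/2}\}$. The Python sketch below checks both on a toy discrete distribution; the function names and the distribution are assumptions made only for illustration.

```python
import math


def min_via_abs(a, b):
    """Minimum of a and b written as (a + b - |a - b|) / 2."""
    return 0.5 * (a + b - abs(a - b))


def min_expectation_bounds(c, values, probs):
    """Return E[min(c, Z)] and its Cauchy-Schwarz lower bound.

    Z takes the given values with the given probabilities.
    """
    mean_min = sum(p * min(c, z) for z, p in zip(values, probs))
    mean_z = sum(p * z for z, p in zip(values, probs))
    second_moment = sum(p * (c - z) ** 2 for z, p in zip(values, probs))
    lower = 0.5 * (c + mean_z - math.sqrt(second_moment))
    return mean_min, lower


exact, lower = min_expectation_bounds(2.0, [0.5, 1.5], [0.5, 0.5])
print(exact, lower, lower <= exact)
```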